Efficient Recovery from Repeated Domain Shifts in Streaming Data

dc.contributor.advisor: Oates, Tim
dc.contributor.author: Gandhewar, Richa Rajendra
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2019-10-11T13:39:17Z
dc.date.available: 2019-10-11T13:39:17Z
dc.date.issued: 2016-01-01
dc.description.abstract: Humans have a remarkable ability to learn how to learn, what to learn, and when to learn. We are able to assess the utility of learned knowledge for achieving an objective and to adapt our learning strategies accordingly. Likewise, we want machine learning systems trained in one domain to adapt well to different domains. If a classifier system encounters a distribution that it has seen previously, it should recall the previously learned knowledge and classify accordingly. This thesis addresses the problem of recovering efficiently from repeated domain shifts in streaming data for a classifier system. The problem can be divided into two sub-problems. The first sub-problem is detecting a domain shift in a data stream representing learned knowledge. Following Dredze, Oates, & Piatko (2010), we use the A-distance (Kifer, Ben-David, & Gehrke 2004) over the absolute value of the classification margin of support vector machines for this task. The second sub-problem is deciding what action to take after a domain shift is detected. We propose and evaluate approaches to training new models and deciding when to reuse old models, with the goal of minimizing cost and maximizing accuracy in the face of repeated domain shifts. We use the Amazon product reviews dataset to evaluate our algorithm.
dc.genre: theses
dc.identifier: doi:10.13016/m2lgb9-xy61
dc.identifier.other: 11482
dc.identifier.uri: http://hdl.handle.net/11603/15474
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Gandhewar_umbc_0434M_11482.pdf
dc.title: Efficient Recovery from Repeated Domain Shifts in Streaming Data
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
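
The shift-detection step mentioned in the abstract (the A-distance computed over absolute classification margins) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the function names `a_distance` and `shift_detected`, the equal-width binning, the `n_bins` parameter, and the `threshold` value are all assumptions introduced here. The interval family is approximated by all runs of contiguous bins, and the distance is taken as twice the largest difference in probability mass that the two samples place on any such interval.

```python
import numpy as np

def a_distance(reference, recent, n_bins=10):
    """Empirical A-distance between two 1-D samples.

    Assumption: the interval family is every run of contiguous
    equal-width bins spanning the combined range of both samples.
    """
    reference = np.asarray(reference, dtype=float)
    recent = np.asarray(recent, dtype=float)
    lo = min(reference.min(), recent.min())
    hi = max(reference.max(), recent.max())
    if hi == lo:
        return 0.0  # both samples are the same constant
    edges = np.linspace(lo, hi, n_bins + 1)
    p = np.histogram(reference, bins=edges)[0] / reference.size
    q = np.histogram(recent, bins=edges)[0] / recent.size
    # c[j] - c[i] is the mass difference on bins i..j-1, so the largest
    # absolute difference over all intervals is max(c) - min(c).
    c = np.concatenate(([0.0], np.cumsum(p - q)))
    return float(2.0 * (c.max() - c.min()))

def shift_detected(margins_ref, margins_new, threshold=0.5):
    """Flag a shift when the A-distance over |margin| exceeds a
    hypothetical threshold (0.5 is an arbitrary placeholder)."""
    d = a_distance(np.abs(margins_ref), np.abs(margins_new))
    return bool(d > threshold)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for SVM margins; real margins would come from
    # a trained classifier's decision values on each window.
    in_domain = rng.normal(1.5, 0.5, size=500)
    shifted = rng.normal(0.2, 0.5, size=500)
    print(shift_detected(in_domain, in_domain[::-1]))
    print(shift_detected(in_domain, shifted))
```

In a streaming setting, `margins_ref` would typically be a window of margins collected right after training and `margins_new` the most recent window; the window sizes and threshold would need tuning for the data at hand.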

Files

Original bundle

Name: Gandhewar_umbc_0434M_11482.pdf
Size: 886.68 KB
Format: Adobe Portable Document Format

License bundle

Name: Gandhewar_Efficient_Open.pdf
Size: 75.96 KB
Format: Adobe Portable Document Format