Multiclass Imbalanced Learning in Ensembles through Selective Sampling

dc.contributor.advisor: Janeja, Vandana P
dc.contributor.advisor: Levin, Scott
dc.contributor.author: Azari, Ali
dc.contributor.department: Information Systems
dc.contributor.program: Information Systems
dc.date.accessioned: 2019-10-11T13:58:01Z
dc.date.available: 2019-10-11T13:58:01Z
dc.date.issued: 2015-01-01
dc.description.abstract: Imbalanced learning is the problem of learning from datasets in which the class proportions are highly imbalanced. Imbalanced datasets are increasingly seen in many domains and pose a challenge to traditional classification techniques. Learning from imbalanced multiclass data (three or more classes) creates additional complexities. Studies suggest that ensemble learners can be trained to emphasize different segments of data pertaining to different classes and thereby produce more accurate results than regular imbalance learning techniques. Thus, we propose a new approach to building ensembles of classifiers for multiclass imbalanced datasets, called Multiclass Imbalance Learning in Ensembles through Selective Sampling (MILES). Each member of MILES is trained with data selectively sampled from the bands around cluster centroids, so that diversity is aggressively encouraged within the ensemble. Resampling techniques are used to balance the distribution of the data that comes from each cluster. We performed several experiments applying our approach to different real-world datasets, demonstrating improved performance in recognizing minority class examples and in balancing the G-mean and Mean Area Under the Curve (MAUC) across classes. We further applied MILES to classify prolonged emergency department (ED) stays, with consistently higher performance than existing ensemble methods.
dc.genre: dissertations
dc.identifier: doi:10.13016/m2xvc4-cn5k
dc.identifier.other: 11416
dc.identifier.uri: http://hdl.handle.net/11603/15613
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Azari_umbc_0434D_11416.pdf
dc.subject: class imbalance learning
dc.subject: data mining
dc.subject: Ensemble Learning
dc.subject: health informatics
dc.subject: machine learning
dc.subject: multiclass classification
dc.title: Multiclass Imbalanced Learning in Ensembles through Selective Sampling
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
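
Illustrative sketch (not part of the record): the abstract describes sampling training data from bands around cluster centroids, rebalancing each sample, and evaluating with the G-mean. The record does not say how bands are defined, how many are used, or which resampling technique MILES applies, so everything below is an assumption made for illustration: bands are taken as equal-sized groups of points ordered by distance to a given centroid, balancing is plain random oversampling, and the function names `split_into_bands`, `oversample_to_balance`, and `g_mean` are hypothetical. The G-mean is computed in its usual multiclass form, the geometric mean of per-class recall.

```python
import math
import random
from collections import defaultdict


def split_into_bands(points, centroid, n_bands):
    """Group point indices into n_bands lists, nearest band first.

    Assumption: a "band" is an equal-sized slice of the points ordered by
    Euclidean distance to the centroid. The last bands may be shorter or empty.
    """
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], centroid))
    size = math.ceil(len(order) / n_bands)
    return [order[k * size:(k + 1) * size] for k in range(n_bands)]


def oversample_to_balance(indices, labels, rng):
    """Duplicate indices at random so each class present appears equally often.

    Assumption: random oversampling stands in for the unspecified
    resampling technique.
    """
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    if not by_class:
        return []
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1 / len(recalls))


if __name__ == "__main__":
    # Toy data: six 2-D points in one cluster whose centroid is assumed known.
    points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0), (1.5, 1.5), (2.0, 2.0)]
    labels = ["a", "a", "b", "a", "a", "c"]
    bands = split_into_bands(points, (0.0, 0.0), n_bands=2)
    rng = random.Random(0)
    # One hypothetical ensemble member would be trained per balanced band.
    for band in bands:
        sample = oversample_to_balance(band, labels, rng)
        print(sorted(labels[i] for i in sample))
    print(round(g_mean(["a", "a", "b", "c"], ["a", "b", "b", "c"]), 4))
```

Training each ensemble member on a different band is one simple way to give members different views of the same cluster, which is the kind of diversity the abstract refers to; the dissertation itself should be consulted for the actual procedure.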

Files

Original bundle

Name: Azari_umbc_0434D_11416.pdf
Size: 3.1 MB
Format: Adobe Portable Document Format

License bundle

Name: Azari_Multiclass_Open.pdf
Size: 45.25 KB
Format: Adobe Portable Document Format
Description: