A Coreset Learning Reality Check

dc.contributor.author: Lu, Fred
dc.contributor.author: Raff, Edward
dc.contributor.author: Holt, James
dc.date.accessioned: 2023-02-28T18:48:14Z
dc.date.available: 2023-02-28T18:48:14Z
dc.date.issued: 2023-01-15
dc.description.abstract: Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information for classification. While these works are supported by theory and limited experiments, to date there has not been a comprehensive evaluation of these methods. In our work, we directly compare multiple methods for logistic regression drawn from the coreset and optimal subsampling literature and discover inconsistencies in their effectiveness. In many cases, methods do not outperform simple uniform subsampling. [en_US]
dc.description.uri: https://arxiv.org/abs/2301.06163 [en_US]
dc.format.extent: 15 pages [en_US]
dc.genre: journal articles [en_US]
dc.genre: preprints [en_US]
dc.identifier: doi:10.13016/m24uce-ea4k
dc.identifier.uri: https://doi.org/10.48550/arXiv.2301.06163
dc.identifier.uri: http://hdl.handle.net/11603/26902
dc.language.iso: en_US [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. [en_US]
dc.title: A Coreset Learning Reality Check [en_US]
dc.type: Text [en_US]
dcterms.creator: https://orcid.org/0000-0002-9900-1972 [en_US]

Files

Original bundle

Name: 2301.06163.pdf
Size: 993.5 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission