A Coreset Learning Reality Check

dc.contributor.author: Lu, Fred
dc.contributor.author: Raff, Edward
dc.contributor.author: Holt, James
dc.date.accessioned: 2023-02-28T18:48:14Z
dc.date.available: 2023-02-28T18:48:14Z
dc.date.issued: 2023-01-15
dc.description.abstract: Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information for classification. While these works are supported by theory and limited experiments, to date there has not been a comprehensive evaluation of these methods. In our work, we directly compare multiple methods for logistic regression drawn from the coreset and optimal subsampling literature and discover inconsistencies in their effectiveness. In many cases, methods do not outperform simple uniform subsampling. [en]
dc.description.uri: https://arxiv.org/abs/2301.06163 [en]
dc.format.extent: 15 pages [en]
dc.genre: journal articles [en]
dc.genre: preprints [en]
dc.identifier: doi:10.13016/m24uce-ea4k
dc.identifier.uri: https://doi.org/10.48550/arXiv.2301.06163
dc.identifier.uri: http://hdl.handle.net/11603/26902
dc.language.iso: en [en]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. [en]
dc.title: A Coreset Learning Reality Check [en]
dc.type: Text [en]
dcterms.creator: https://orcid.org/0000-0002-9900-1972 [en]

Files

Original bundle

Name: 2301.06163.pdf
Size: 993.5 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission