Optimization of the K-means Clustering Algorithm through Initialized Principal Direction Divisive Partitioning

dc.contributor.author: James, Bruce
dc.date.accessioned: 2025-12-15T14:57:45Z
dc.date.issued: 2018
dc.description.abstract: Data clustering is invaluable to the automated analysis of large document sets. Documents are converted into vectors in a finite-dimensional space, and the resulting collection of salient features is then processed by a clustering algorithm of one's choice, such as the classic k-means algorithm. Because the feature space is large, different algorithms offer different trade-offs between accuracy and computational efficiency. This study investigates the Principal Direction Divisive Partitioning (PDDP) algorithm, described as a top-down hierarchical technique, as a plug-in to the k-means algorithm. K-means' reliance on an initial random partition builds computational cost into the analysis. The study seeds k-means with a PDDP-initialized partition and compares its computational efficiency with that of a k-means trial without PDDP.
dc.description.uri: https://ur.umbc.edu/wp-content/uploads/sites/354/2019/05/umbc_review_2018_vol19.pdf#page=38
dc.format.extent: 17 pages
dc.genre: journal articles
dc.identifier: doi:10.13016/m2yecl-tseo
dc.identifier.citation: James, Bruce. “Optimization of the K-Means Clustering Algorithm through Initialized Principal Direction Divisive Partitioning.” UMBC Review: Journal of Undergraduate Research 19 (2018): 37–54. https://ur.umbc.edu/wp-content/uploads/sites/354/2019/05/umbc_review_2018_vol19.pdf#page=38
dc.identifier.uri: http://hdl.handle.net/11603/41114
dc.language.iso: en
dc.publisher: University of Maryland, Baltimore County
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Mathematics and Statistics Department
dc.relation.ispartof: UMBC Review
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Optimization of the K-means Clustering Algorithm through Initialized Principal Direction Divisive Partitioning
dc.type: Text

Files

Original bundle

Name: 24UMBCReview2018Volume19_Kmeans.pdf
Size: 810.4 KB
Format: Adobe Portable Document Format