Cluster Quality Analysis Using Silhouette Score
dc.contributor.author | Shahapure, Ketan Rajshekhar | |
dc.contributor.author | Nicholas, Charles | |
dc.date.accessioned | 2020-12-14T16:10:51Z | |
dc.date.available | 2020-12-14T16:10:51Z | |
dc.date.issued | 2020-11-20 | |
dc.description | 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), 6-9 Oct. 2020, Sydney, Australia | en_US |
dc.description.abstract | Clustering is an important phase in data mining. Selecting the number of clusters in a clustering algorithm, e.g. choosing the best value of k in the various k-means algorithms [1], can be difficult. We studied the use of silhouette scores and scatter plots to suggest, and then validate, the number of clusters we specified in running the k-means clustering algorithm on two publicly available data sets. Scikit-learn's [4] silhouette score method, which is a measure of the quality of a cluster, was used to find the mean silhouette coefficient of all the samples for different numbers of clusters. The highest silhouette score indicates the optimal number of clusters. We present several instances of utilizing the silhouette score to determine the best value of k for those data sets. | en_US |
dc.description.uri | https://ieeexplore.ieee.org/document/9260048/authors#authors | en_US |
dc.format.extent | 2 pages | en_US |
dc.genre | conference papers and proceedings postprints | en_US |
dc.identifier | doi:10.13016/m2zwe2-2w49 | |
dc.identifier.citation | K. R. Shahapure and C. Nicholas, "Cluster Quality Analysis Using Silhouette Score," 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), Sydney, Australia, 2020, pp. 747-748, doi: 10.1109/DSAA49011.2020.00096. | en_US |
dc.identifier.uri | https://doi.org/10.1109/DSAA49011.2020.00096 | |
dc.identifier.uri | http://hdl.handle.net/11603/20251 | |
dc.language.iso | en_US | en_US |
dc.publisher | IEEE | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.rights | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
dc.title | Cluster Quality Analysis Using Silhouette Score | en_US |
dc.type | Text | en_US |
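The abstract above describes picking k by comparing the mean silhouette coefficient across candidate cluster counts. The following is a minimal sketch of that idea, not the authors' code: it uses only the Python standard library, assumes Euclidean distance, and the toy points, the hand-written labelings, and the names `mean_silhouette` and `candidate_labelings` are all hypothetical. In the paper's setting the labels would come from k-means and the score from scikit-learn's `silhouette_score`, which computes the same mean-over-samples quantity.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    # Group sample indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a sample alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Toy data (assumed for illustration): two well-separated groups of three.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hand-written labelings standing in for k-means output at k=2 and k=3.
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],  # needlessly splits the first group
}

scores = {k: mean_silhouette(points, labels)
          for k, labels in candidate_labelings.items()}
best_k = max(scores, key=scores.get)

for k in sorted(scores):
    print(f"k={k}: mean silhouette = {scores[k]:.3f}")
print("suggested k:", best_k)
```

On this toy data the k=2 labeling scores higher than the k=3 labeling, so the sketch suggests k=2. A score near 1 means samples sit much closer to their own cluster than to the nearest other cluster, while a score near 0 means clusters overlap or were split arbitrarily.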
Files
Original bundle
- Name: Cluster_Quality_Analysis_Using_Silhouette_Score_DSAA (2).pdf
- Size: 436.87 KB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 2.56 KB
- Format: Item-specific license agreed upon to submission