Evaluation of Traditional and Deep Clustering Algorithms for Multivariate Spatio-Temporal Data
Collections
UMBC Student Collection
iHARP: NSF HDR Institute for Harnessing Data and Model Revolution in the Polar Regions
UMBC Center for Accelerated Real Time Analysis
UMBC Center for Real-time Distributed Sensing and Autonomy
UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
Citation of Original Publication
Nji, Ndikum Francis, Rohan Mandar Salvi, Sai Sri Ram Kuram Tirumala, Jianwu Wang, and Xue Zheng. “Evaluation of Traditional and Deep Clustering Algorithms for Multivariate Spatio-Temporal Data.” Lawrence Livermore National Laboratory, October 28, 2024.
Rights
This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain
Abstract
Spatiotemporal data is common in many disciplines, such as atmospheric science, Earth science, and environmental science, where it is generated by monitoring an area over a period of time. Analyzing such high-dimensional data is critical for uncovering hidden patterns, and one important approach is to cluster it along the temporal dimension into smaller groups. While classical methods such as K-means and Gaussian Mixture Models (GMM) are favored for their simplicity and interpretability, they struggle to model the complex, high-dimensional relationships inherent in nonlinear spatiotemporal data. In contrast, deep clustering algorithms, which combine neural networks with unsupervised learning objectives, learn latent representations that better capture nonlinear spatiotemporal dependencies. This study provides a rigorous evaluation of both traditional and deep clustering algorithms on high-dimensional multivariate spatiotemporal climate datasets. Our comparative study examines the performance of these techniques on synthetic and real-world datasets, assessing clustering accuracy and stability. We emphasize the advantages of deep clustering, particularly in applications such as climate data analysis and traffic flow prediction, where mining and understanding nonlinear high-dimensional correlations is critical. The results demonstrate that while traditional clustering algorithms are effective for basic tasks, deep learning-based approaches outperform them in handling the complex nonlinear patterns present in high-dimensional multivariate spatiotemporal data.
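To make "clustering along the temporal dimension" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes NumPy, a synthetic array of shape (time, variables, height, width) with two artificial regimes, and a plain Lloyd-style K-means written out by hand. The function names flatten_time_steps, assign, and kmeans, the array sizes, and the noise levels are all hypothetical choices made for illustration.

```python
import numpy as np

def flatten_time_steps(data):
    """Reshape (T, V, H, W) into (T, V*H*W): one feature vector per time step."""
    return data.reshape(data.shape[0], -1)

def assign(X, centers):
    """Return, for each row of X, the index of the nearest center."""
    # Squared Euclidean distance from every sample to every center: shape (n, k).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration; centers start at k distinct random samples."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's center where it is
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

# Synthetic stand-in for a multivariate gridded dataset (assumed sizes):
# 60 time steps, 3 variables, an 8 x 8 grid, with a shift halfway through.
rng = np.random.default_rng(42)
T, V, H, W = 60, 3, 8, 8
data = rng.normal(0.0, 0.5, size=(T, V, H, W))
data[T // 2:] += 3.0  # second regime: every variable offset by a constant

X = flatten_time_steps(data)        # shape (60, 192)
labels, centers = kmeans(X, k=2)    # one cluster label per time step
print(labels)
```

Each time step becomes a single high-dimensional vector, so the number of features grows with the number of variables and the grid size. A deep clustering method would typically replace the flattening step with a learned low-dimensional representation before the grouping step.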
