Benchmarking Parallel K-Means Cloud Type Clustering from Satellite Data
Date
2019-10-08
Type of Work
27 pages
Text
book chapters
Citation of Original Publication
Barajas, Carlos; Guo, Pei; Mukherjee, Lipi; Hoban, Susan; Wang, Jianwu; Jin, Daeho; Gangopadhyay, Aryya; Gobbert, Matthias K.; Benchmarking Parallel K-Means Cloud Type Clustering from Satellite Data; International Symposium on Benchmarking, Measuring and Optimization Journal; Benchmarking, Measuring, and Optimizing pp 248-260; https://link.springer.com/chapter/10.1007/978-3-030-32813-9_20#citeas
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Public Domain Mark 1.0
This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
http://creativecommons.org/publicdomain/mark/1.0/
Abstract
The study of clouds, i.e., where they occur and what their characteristics are, plays a key role in the understanding of climate change. Clustering is a common machine learning technique used in atmospheric science to classify cloud types. Many parallelism techniques, e.g., MPI, OpenMP, and Spark, could achieve efficient and scalable clustering of large-scale satellite observation data. In order to understand their differences, this paper studies and compares three different approaches to parallel clustering of satellite observation data. Benchmarking experiments with k-means clustering are conducted with three parallelism techniques, namely OpenMP, OpenMP+MPI, and Spark, on an HPC cluster using up to 16 nodes.
The following license files are associated with this item:
- Creative Commons