Scalable Multivariate Causality Discovery From Large-Scale Global Spatiotemporal Climate Data

Guo, Pei

dc.contributor.advisor	Wang, Jianwu
dc.contributor.author	Guo, Pei
dc.contributor.department	Information Systems
dc.contributor.program	Information Systems
dc.date.accessioned	2022-02-09T15:52:47Z
dc.date.available	2022-02-09T15:52:47Z
dc.date.issued	2020-01-01
dc.description.abstract	The study of causality investigates cause-effect relationships among the variables of a system and has been widely researched in climatology. To discover causal relationships from time-series datasets, many data-driven causality discovery methods, e.g., Granger causality, PCMCI and Dynamic Bayesian Networks, have been proposed. Most existing approaches face computing challenges when used to discover causality from the rapidly growing volume of available data with increasing dimensionality. These approaches mine time-series data and generate a directed causality graph in which each edge denotes a cause-effect relationship between the two connected nodes, yet in most cases the results of one algorithm differ from those of the others. Furthermore, the ever-increasing amount of available climate data makes it more and more difficult to use existing causality discovery algorithms and technologies to generate results within a reasonable time and budget. The three main challenges in discovering causality from large-scale, complex climate observation and simulation datasets are computing complexity, result uncertainty and reproducibility. To deal with computing complexity, we design and implement a new incremental parallel gradient boosting causality discovery method that addresses the challenge of learning from non-linear and hybrid climate data of increasing size and dimensionality. To deal with uncertainty, that is, the differing results produced by existing data-driven causality discovery methods, we propose a hybrid model ensemble framework that uses existing data partitioning and ensemble techniques to generate more accurate and more stable results.
Finally, to achieve reproducibility, we develop and deploy causality-as-a-service on the AWS cloud, giving researchers a user-friendly and budget-friendly way to perform causality discovery on large-scale time-series data. Our experiments with both synthetic and real-world datasets show that the proposed methods are effective and efficient solutions to these challenges.
dc.format	application:pdf
dc.genre	dissertations
dc.identifier	doi:10.13016/m2oopr-ssqk
dc.identifier.other	12385
dc.identifier.uri	http://hdl.handle.net/11603/24198
dc.language	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Information Systems Department Collection
dc.relation.ispartof	UMBC Theses and Dissertations Collection
dc.relation.ispartof	UMBC Graduate School Collection
dc.relation.ispartof	UMBC Student Collection
dc.source	Original File Name: Guo_umbc_0434D_12385.pdf
dc.subject	Big data
dc.subject	Causality discovery
dc.subject	Machine learning
dc.subject	XaaS
dc.title	Scalable Multivariate Causality Discovery From Large-Scale Global Spatiotemporal Climate Data
dc.type	Text
dcterms.accessRights	Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
dcterms.accessRights	This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Files

Original bundle

Name: Guo_umbc_0434D_12385.pdf
Size: 2.83 MB
Format: Adobe Portable Document Format

Collections

UMBC Theses and Dissertations
UMBC Graduate School
UMBC Information Systems Department
UMBC Student Collection