Large-Scale Causality Discovery Analytics as a Service

dc.contributor.authorWang, Xin
dc.contributor.authorGuo, Pei
dc.contributor.authorWang, Jianwu
dc.date.accessioned2022-09-26T15:11:39Z
dc.date.available2022-09-26T15:11:39Z
dc.date.issued2022-01-13
dc.description2021 IEEE International Conference on Big Data (Big Data), 15-18 December 2021, Orlando, FL, USAen
dc.description.abstractData-driven causality discovery is a common way to understand causal relationships among different components of a system. We study how to achieve scalable data-driven causality discovery on Amazon Web Services (AWS) and Microsoft Azure cloud and propose a causality discovery as a service (CDaaS) framework. With this framework, users can easily rerun previous causality discovery experiments or run causality discovery with different setups (such as new datasets or causality discovery parameters). Our CDaaS leverages Cloud Container Registry service and Virtual Machine service to achieve scalable causality discovery with different discovery algorithms. We further did extensive experiments and benchmarking of our CDaaS to understand the effects of seven factors (big data engine parameter setting, virtual machine instance number, type, subtype, size, cloud service, cloud provider) and how to best provision cloud resources for our causality discovery service based on certain goals including execution time, budgetary cost and cost-performance ratio. We report our findings from the benchmarking, which can help obtain optimal configurations based on each application’s characteristics. The findings show proper configurations could lead to both faster execution time and less budgetary cost.en
dc.description.sponsorshipThis work is supported by grant CAREER: Big Data Climate Causality Analytics (OAC–1942714) and grant CyberTraining: DSE: Cross-Training of Researchers in Computing, Applied Mathematics and Atmospheric Sciences using Advanced Cyberinfrastructure Resources (OAC–1730250) from the National Science Foundation.en
dc.description.urihttps://ieeexplore.ieee.org/document/9671373en
dc.format.extent11 pagesen
dc.genreconference papers and proceedingsen
dc.genrepreprintsen
dc.genrecomputer codeen
dc.identifierdoi:10.13016/m2qb37-0onz
dc.identifier.citationX. Wang, P. Guo and J. Wang, "Large-Scale Causality Discovery Analytics as a Service," 2021 IEEE International Conference on Big Data (Big Data), 2021, pp. 3130-3140, doi: 10.1109/BigData52589.2021.9671373.en
dc.identifier.urihttps://doi.org/10.1109/BigData52589.2021.9671373
dc.identifier.urihttp://hdl.handle.net/11603/25880
dc.language.isoenen
dc.publisherIEEEen
dc.relation.isAvailableAtThe University of Maryland, Baltimore County (UMBC)
dc.relation.ispartofUMBC Information Systems Department Collection
dc.relation.ispartofUMBC Faculty Collection
dc.relation.ispartofUMBC Student Collection
dc.rights© 2021 IEEE.  Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.en
dc.subjectUMBC Big Data Analytics Laben
dc.titleLarge-Scale Causality Discovery Analytics as a Serviceen
dc.typeTexten
dcterms.creatorhttps://orcid.org/0000-0002-9933-1170en

Files

Original bundle

Name:
2021 Large-Scale Causality Discovery Analytics_as a Service.pdf
Size:
510 KB
Format:
Adobe Portable Document Format
Description:
Name:
Causality_Discovery_as_a_Service-main.zip
Size:
81.17 KB
Format:
Unknown data format
Description:
Code

License bundle

Name:
license.txt
Size:
2.56 KB
Format:
Item-specific license agreed upon to submission
Description: