Secure, Reproducible And Adaptive Machine Learning In Distributed Systems

dc.contributor.advisorWang, Jianwu
dc.contributor.advisorDuan, Sisi
dc.contributor.authorWang, Xin
dc.contributor.departmentInformation Systems
dc.contributor.programInformation Systems
dc.date.accessioned2023-07-07T16:02:11Z
dc.date.available2023-07-07T16:02:11Z
dc.date.issued2022-01-01
dc.description.abstractDistributed machine learning allows multiple workers to work on the same machine learning task collaboratively. Motivated by the explosive growth of scientific and industrial research on distributed systems and machine learning, this dissertation explores three main topics of machine learning in distributed systems: security, reproducibility, and adaptivity. To tackle the challenge of security in distributed machine learning, we present an intrusion-tolerant federated learning architecture named ReDeF. On the parameter server side, ReDeF uses a Byzantine fault-tolerant protocol to enhance the reliability of the service. On the worker side, we study a new failure model called fully corrupted workers, which captures scenarios not considered before. To handle fully corrupted workers, we propose a worker-centric approach, allowing workers to collaboratively decide how the global model evolves. To support this approach, we also propose a new data structure called the global model tree, which allows workers to track the evolution of global models. For reproducibility in distributed machine learning with the cloud, in order to automate end-to-end execution of analytics and solve the vendor lock-in problem, we propose and develop an open-source toolkit that supports 1) fully automated end-to-end execution and reproduction via a single command, 2) automatic data and configuration storage for each execution, 3) flexible client modes based on user preferences, 4) execution history queries, and 5) simple reproduction of existing executions in the same or a different environment. Unlike traditional analytics, which assumes the data to be processed are available ahead of time and will not change, stream analytics deals with data that are generated continuously and whose distribution may change (known as concept drift), causing prediction/forecasting accuracy to drop over time. 
Another challenge is finding the best resource provisioning for stream analytics to achieve good overall latency. Based on these observations, we study how to best leverage edge and cloud resources to achieve better accuracy and latency for RNN-based stream analytics. We propose a novel edge-cloud integrated framework for hybrid stream analytics that supports low-latency inference on the edge and high-capacity training on the cloud. We also study three flexible deployments of our hybrid learning framework. Further, our hybrid learning framework can dynamically combine inference results from an RNN model pre-trained on historical data and another RNN model re-trained periodically on the most recent data. Our evaluations and discussions in each chapter show that all the proposed solutions are effective and efficient in solving the security, reproducibility, and adaptivity challenges, respectively.
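The dynamic combination of the two RNN models' inference results described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's actual code: the function name, arguments, and inverse-error weighting rule are all assumptions chosen for illustration; the idea shown is simply that the model with the lower recent prediction error receives the higher weight, so the combined output adapts as concept drift degrades the pre-trained model.

```python
def combine_predictions(pred_pretrained, pred_retrained,
                        recent_error_pre, recent_error_re):
    """Blend two models' predictions, weighting each model inversely
    to its recent prediction error (illustrative sketch only)."""
    total = recent_error_pre + recent_error_re
    if total == 0:
        # Both models equally accurate on recent data; split evenly.
        w_re = 0.5
    else:
        # A lower recent error for one model shifts weight toward it.
        w_re = recent_error_pre / total
    w_pre = 1.0 - w_re
    return w_pre * pred_pretrained + w_re * pred_retrained
```

Under this hypothetical rule, equal recent errors average the two predictions, while a drift-degraded pre-trained model is gradually phased out in favor of the re-trained one.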
dc.formatapplication:pdf
dc.genredissertation
dc.identifierdoi:10.13016/m2lcpi-wo9n
dc.identifier.other12562
dc.identifier.urihttp://hdl.handle.net/11603/28462
dc.languageen
dc.relation.isAvailableAtThe University of Maryland, Baltimore County (UMBC)
dc.relation.ispartofUMBC Information Systems Collection
dc.relation.ispartofUMBC Theses and Dissertations Collection
dc.relation.ispartofUMBC Graduate School Collection
dc.relation.ispartofUMBC Student Collection
dc.rightsThis item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.sourceOriginal File Name: Wang_umbc_0434D_12562.pdf
dc.titleSecure, Reproducible And Adaptive Machine Learning In Distributed Systems
dc.typeText
dcterms.accessRightsDistribution Rights granted to UMBC by the author.
dcterms.accessRightsAccess limited to the UMBC community. Item may be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle

Name: Wang_umbc_0434D_12562.pdf
Size: 2.42 MB
Format: Adobe Portable Document Format

License bundle

Name: Wang-Xin_Open.pdf
Size: 229.63 KB
Format: Adobe Portable Document Format