Adaptive and Efficient Streaming Time Series Forecasting with Lambda Architecture and Spark

dc.contributor.author: Pandya, Arjun
dc.contributor.author: Odunsi, Oluwatobiloba
dc.contributor.author: Liu, Chen
dc.contributor.author: Cuzzocrea, Alfredo
dc.contributor.author: Wang, Jianwu
dc.date.accessioned: 2022-09-29T14:07:57Z
dc.date.available: 2022-09-29T14:07:57Z
dc.date.issued: 2021-03-19
dc.description: 2020 IEEE International Conference on Big Data (Big Data)
dc.description.abstract: The rise of Internet of Things (IoT) devices and streaming platforms has tremendously increased the volume of data in motion, or streaming data. It incorporates a wide variety of data, for example, social media posts, online gamers' in-game activities, mobile or web application logs, online e-commerce transactions, financial trading, or geospatial services. Accurate and efficient forecasting based on real-time data is a critical part of operations in areas like energy and utility consumption, healthcare, industrial production, supply chain, weather forecasting, financial trading, and agriculture. Statistical time series forecasting methods like Autoregression (AR), Autoregressive Integrated Moving Average (ARIMA), and Vector Autoregression (VAR) face the challenge of concept drift in the streaming data, i.e., the properties of the stream may change over time. Another challenge is the efficiency with which the system can update the Machine Learning (ML) models based on these algorithms to tackle the concept drift. In this paper, we propose a novel framework to tackle both of these challenges. The challenge of adaptability is addressed by applying the Lambda architecture to forecast the future state based on three approaches simultaneously: batch (historic) data-based prediction, streaming (real-time) data-based prediction, and hybrid prediction that combines the first two. To address the challenge of efficiency, we implement a distributed VAR algorithm on top of the Apache Spark big data platform. To evaluate our framework, we conducted experiments on streaming time series forecasting with four types of data sets: data without drift (no drift), data with gradual drift, data with abrupt drift, and data with mixed drift. The experiments show the differences among our three forecasting approaches in terms of accuracy and adaptability. [en_US]
dc.description.uri: https://ieeexplore.ieee.org/document/9377947 [en_US]
dc.format.extent: 9 pages [en_US]
dc.genre: conference papers and proceedings [en_US]
dc.genre: preprints [en_US]
dc.genre: computer code [en_US]
dc.identifier: doi:10.13016/m2hwnr-6wsw
dc.identifier.citation: A. Pandya, O. Odunsi, C. Liu, A. Cuzzocrea and J. Wang, "Adaptive and Efficient Streaming Time Series Forecasting with Lambda Architecture and Spark," 2020 IEEE International Conference on Big Data (Big Data), 2020, pp. 5182-5190, doi: 10.1109/BigData50022.2020.9377947. [en_US]
dc.identifier.uri: https://doi.org/10.1109/BigData50022.2020.9377947
dc.identifier.uri: http://hdl.handle.net/11603/25923
dc.language.iso: en_US [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en_US]
dc.subject: UMBC Big Data Analytics Lab [en_US]
dc.subject: UMBC High Performance Computing Facility (HPCF)
dc.title: Adaptive and Efficient Streaming Time Series Forecasting with Lambda Architecture and Spark [en_US]
dc.type: Text [en_US]
dcterms.creator: https://orcid.org/0000-0002-9933-1170 [en_US]
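
As a rough illustration of the three prediction paths named in the abstract (batch, streaming, and hybrid), the sketch below may help. It is not the authors' implementation: it swaps the distributed VAR on Apache Spark for a univariate AR(1) least-squares fit in plain Python, and the names `fit_ar1`, `forecast_next`, `lambda_forecasts`, `recent_window`, and `batch_weight`, as well as the weighted-average way of combining the two predictions, are assumptions made only for this example.

```python
from statistics import fmean


def fit_ar1(series):
    """Ordinary least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    mean_prev, mean_next = fmean(x_prev), fmean(x_next)
    var_prev = sum((p - mean_prev) ** 2 for p in x_prev)
    if var_prev == 0:
        # Constant input: fall back to predicting the mean.
        return mean_next, 0.0
    cov = sum((p - mean_prev) * (n - mean_next) for p, n in zip(x_prev, x_next))
    phi = cov / var_prev
    return mean_next - phi * mean_prev, phi


def forecast_next(model, last_value):
    """One-step-ahead forecast from an AR(1) model (c, phi)."""
    c, phi = model
    return c + phi * last_value


def lambda_forecasts(history, recent_window=20, batch_weight=0.5):
    """Return batch, speed (streaming), and hybrid one-step forecasts.

    Assumption: the hybrid forecast is a fixed weighted average of the
    other two; the source abstract does not say how they are combined.
    """
    batch_model = fit_ar1(history)                   # batch layer: all historic data
    speed_model = fit_ar1(history[-recent_window:])  # speed layer: recent data only
    last = history[-1]
    batch_pred = forecast_next(batch_model, last)
    speed_pred = forecast_next(speed_model, last)
    hybrid_pred = batch_weight * batch_pred + (1 - batch_weight) * speed_pred
    return {"batch": batch_pred, "speed": speed_pred, "hybrid": hybrid_pred}


# Synthetic series with an abrupt drift: the step size changes from 1 to 3.
history = [float(t) for t in range(100)] + [99.0 + 3.0 * k for k in range(1, 31)]
forecasts = lambda_forecasts(history)
```

On this synthetic series the speed model sees only post-drift data, so it tracks the new step size, while the batch model is still influenced by the pre-drift regime; the hybrid forecast falls between the two.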

Files

Original bundle
Name:
time-series-analysis-master.zip
Size:
14.9 MB
Format:
Unknown data format
Description:
Computer Code
Name:
2020 Adaptive and Efficient Streaming Time Series Forecasting with Lambda Architecture and Spark (Big Data 2020) Camera-Ready.pdf
Size:
1.17 MB
Format:
Adobe Portable Document Format

License bundle
Name:
license.txt
Size:
2.56 KB
Format:
Item-specific license agreed upon to submission