A Streaming Tensor Decomposition Analysis for Earth Science Informatics
Department
Computer Science and Electrical Engineering
Program
Computer Science
Rights
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.
Abstract
Today, many application domains in science, sports, social media, and health give rise to streaming multi-way data that can be naturally represented and analyzed via tensors.
Time sequences produced from measurements at different locations, made from a variety
of platforms, over a wide range of events, lend themselves to important tensor
decomposition (TD) applications. TD is any scheme for expressing a "data tensor" (M-way array or M-nodes) as a sequence of elementary outer product operations acting on
other, often simpler tensors. TDs have applications in data analysis, signal processing,
machine learning, and data mining. We apply the Shaden Smith (SPLATT) algorithm to
form a tensor decomposition that provides a leading coefficient for each outer product
term. The values of the leading coefficients for the outer product terms are analogous to the
singular values in the singular value decomposition of a matrix. We computed TD for the
FROSTT streaming data test modules, streaming aerosol concentration profiling from a
network of ceilometers, Weather Research and Forecasting (WRF) model output analysis, and 40
years of hourly climate observation analysis. We applied TD to a data set of aerosol concentrations used to predict air quality. We
implemented a multi-sensor ground-based observatory network consisting of three lidar
firing ceilometers distributed along a 650 km corridor along the east coast, which
provided near-real-time streaming of high-resolution aerosol concentration profiles from
the ground up to 15 km for a one-year period. Daily variations of the aerosol
concentration are used to determine the planetary boundary layer height (PBLH). We
determined the PBLH from our observation
network in near real time and over a one-year period. We applied TD to the WRF-simulated
outputs over the entire continental US to study the time dependence of the dominant
components of PBLH. Results obtained by TD were compared with the ceilometer
observations to assess the accuracy of the model-generated PBLH. In a second application,
TD was applied to ERA5, a global reanalysis of 40 years of atmospheric and model
data, which enabled the study of the dominant components of the PBLH on a planetary
scale. We further applied TD to global warming data over the entire 40 years. Finally, we
examine the power spectral distribution of the leading coefficients associated with the
maximum tensor rank of the PBLH and the surface wind speeds for any similarities to
fluid turbulence power laws.
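
The form of decomposition described above, a sum of outer product terms each carrying a leading coefficient, can be illustrated with a small sketch. This is not the SPLATT algorithm and does not fit any data. It only shows, for a three-way case, how unit-length factor vectors and a leading coefficient combine into a tensor. The function names, the mode labels (time, location, height), and all numeric values are assumptions invented for illustration.

```python
import math

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def normalize_term(a, b, c):
    """Scale each factor vector to unit length.

    The product of the three original lengths becomes the leading
    coefficient of the outer product term.
    """
    na, nb, nc = norm(a), norm(b), norm(c)
    lam = na * nb * nc
    return lam, [x / na for x in a], [x / nb for x in b], [x / nc for x in c]

def outer3(lam, a, b, c):
    """Weighted three-way outer product: T[i][j][k] = lam * a[i] * b[j] * c[k]."""
    return [[[lam * ai * bj * ck for ck in c] for bj in b] for ai in a]

def add_tensors(s, t):
    """Element-wise sum of two nested-list tensors of equal shape."""
    return [[[x + y for x, y in zip(rs, rt)]
             for rs, rt in zip(ms, mt)]
            for ms, mt in zip(s, t)]

def reconstruct(terms):
    """Sum the weighted outer product terms into one full tensor."""
    total = None
    for lam, a, b, c in terms:
        term = outer3(lam, a, b, c)
        total = term if total is None else add_tensors(total, term)
    return total

# Two hypothetical components over (time x location x height); values are invented.
raw_terms = [
    ([3.0, 4.0], [1.0, 0.0, 0.0], [2.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0, 1.0], [0.0, 1.0]),
]

# Normalize each term, then order the terms by leading coefficient, largest first.
terms = sorted((normalize_term(*t) for t in raw_terms),
               key=lambda t: t[0], reverse=True)

tensor = reconstruct(terms)          # 2 x 3 x 2 nested list
leading = [t[0] for t in terms]      # leading coefficients, descending
print(leading)
```

Ordering the terms by leading coefficient mirrors the way singular values are usually listed in descending order, which is what makes the largest coefficients a natural indicator of the dominant components.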
