Performance considerations for scalable parallel tensor decomposition

dc.contributor.author: Rolinger, Thomas B.
dc.contributor.author: Simon, Tyler A.
dc.contributor.author: Krieger, Christopher D.
dc.date.accessioned: 2024-02-29T16:27:47Z
dc.date.available: 2024-02-29T16:27:47Z
dc.date.issued: 2019-04-26
dc.description.abstract: Tensor decomposition, the higher-order analogue of singular value decomposition, has emerged as a useful tool for finding relationships in large, sparse, multidimensional data. As this technique matures and is applied to increasingly larger data sets, the need for high-performance implementations becomes critical. A better understanding of the performance characteristics of tensor decomposition on large and sparse tensors can help drive the development of such implementations. In this work, we perform an objective empirical evaluation of three state-of-the-art parallel tools that implement the Canonical Decomposition/Parallel Factorization tensor decomposition algorithm using alternating least squares fitting (CP-ALS): SPLATT, DFacTo, and ENSIGN. We conduct performance studies across a variety of data sets and evaluate the tools with respect to total memory required, processor stall cycles, execution time, data distribution, and communication patterns. Furthermore, we investigate the performance of the implementations on tensors with up to 6 dimensions and when executing high-rank decompositions. We find that tensor data structure layout and distribution choices can result in differences as large as 14.6x with respect to memory usage and 39.17x with respect to execution time. We provide an outline of a distributed heterogeneous CP-ALS implementation that addresses the performance issues we observe.
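The abstract centers on the CP-ALS algorithm. As a rough illustration of what that algorithm computes (not the implementation used by SPLATT, DFacTo, or ENSIGN, which operate on distributed sparse tensors), here is a minimal dense CP-ALS sketch in NumPy; the helper names `khatri_rao`, `unfold`, and `cp_als` are illustrative, not taken from any of the evaluated tools:

```python
import numpy as np
from functools import reduce

def khatri_rao(A, B):
    """Column-wise Kronecker (Khatri-Rao) product of two factor matrices."""
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def unfold(X, mode):
    """Mode-n matricization of X, remaining modes flattened in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, iters=100, seed=0):
    """Fit a rank-`rank` CP model to a dense tensor X by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in X.shape]
    for _ in range(iters):
        # Update each factor matrix in turn, holding the others fixed.
        for n in range(X.ndim):
            others = [factors[m] for m in range(X.ndim) if m != n]
            # Khatri-Rao product of the other factors, in increasing mode order
            # (matches the C-order unfolding above).
            kr = reduce(khatri_rao, others)
            # Gram matrix: Hadamard product of the other factors' Gram matrices.
            G = reduce(lambda P, Q: P * Q, [F.T @ F for F in others])
            factors[n] = unfold(X, n) @ kr @ np.linalg.pinv(G)
    return factors
```

In real sparse implementations, the `unfold(X, n) @ kr` step is the matricized-tensor-times-Khatri-Rao product (MTTKRP), whose data layout and distribution drive the memory and runtime differences the paper measures.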
dc.description.uri: https://www.sciencedirect.com/science/article/pii/S0743731517302897
dc.format.extent: 24 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m23ujf-tldb
dc.identifier.citation: Rolinger, Thomas B., Tyler A. Simon, and Christopher D. Krieger. “Performance Considerations for Scalable Parallel Tensor Decomposition.” Journal of Parallel and Distributed Computing 129 (July 1, 2019): 83–98. https://doi.org/10.1016/j.jpdc.2017.10.013.
dc.identifier.uri: https://doi.org/10.1016/j.jpdc.2017.10.013
dc.identifier.uri: http://hdl.handle.net/11603/31744
dc.publisher: Elsevier
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Performance considerations for scalable parallel tensor decomposition
dc.type: Text

Files

Original bundle

Name: JPDC-preprint.pdf
Size: 3.71 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Description: Item-specific license agreed upon to submission