Mode Coresets for Efficient, Interpretable Tensor Decompositions: An Application to Feature Selection in fMRI Analysis

Date

2024-12-13

Rights

Attribution 4.0 International

Abstract

Tensor decompositions, which generalize matrix decompositions to multidimensional arrays, are simple yet powerful methods for analyzing datasets in the form of tensors. These decompositions model a data tensor as a sum of rank-1 tensors, whose factors are useful in a wide range of applications. Given the massive sizes of modern datasets, an important challenge is balancing how well computational complexity scales with the data against how well the decomposition approximates the data. Many efficient methods exploit a small subset of the tensor's elements, representing most of the tensor's variation via a basis over the subset. These methods' efficiencies are often due to their randomized nature; however, deterministic methods can provide better approximations and can perform feature selection, highlighting a meaningful subset that represents the entire tensor well. In this paper, we introduce an efficient subset-based form of the Tucker decomposition that selects coresets from the tensor modes such that the resulting core tensor approximates the full tensor well. Furthermore, our method enables a novel feature selection scheme unlike other methods for tensor data. We introduce methods for random and deterministic coresets, minimizing error via a measure of discrepancy between the coreset and the full tensor. We evaluate the decompositions on simulated data and apply them to real-world fMRI data to demonstrate our method's feature selection ability. We demonstrate that, compared with similar decomposition methods, our methods can typically better approximate the tensor at comparably low computational complexity.
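
To make the idea of a subset-based Tucker approximation concrete, the following is a minimal NumPy sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a generic CUR-style construction: the core is the subtensor indexed by one subset per mode, and each factor matrix is fitted from the fibers that pass through the other modes' subsets. The subsets are drawn uniformly at random, standing in for whatever random or deterministic selection rule one prefers. The names `unfold`, `mode_product`, `subset_tucker`, and `reconstruct`, as well as the `rcond` cutoff, are hypothetical choices made for this illustration.

```python
import numpy as np


def unfold(t, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)


def mode_product(t, mat, mode):
    """Multiply tensor `t` by matrix `mat` along axis `mode`."""
    out = np.tensordot(mat, t, axes=(1, mode))  # contracted axis is replaced in front
    return np.moveaxis(out, 0, mode)


def subset_tucker(x, index_sets, rcond=1e-10):
    """CUR-style Tucker approximation from one index subset per mode (assumed scheme)."""
    core = x[np.ix_(*index_sets)]  # subtensor over the selected indices
    factors = []
    for mode in range(x.ndim):
        # Keep every index along `mode`, restrict all other modes to their subsets.
        sel = list(index_sets)
        sel[mode] = np.arange(x.shape[mode])
        fibers = x[np.ix_(*sel)]
        # Least-squares map from the core's mode-n unfolding to the full fibers.
        u = unfold(fibers, mode) @ np.linalg.pinv(unfold(core, mode), rcond=rcond)
        factors.append(u)
    return core, factors


def reconstruct(core, factors):
    """Apply each factor matrix to the core along its own mode."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Synthetic tensor with exact multilinear rank (3, 3, 3).
rng = np.random.default_rng(0)
shape, rank, k = (20, 18, 16), 3, 6
true_core = rng.standard_normal((rank,) * 3)
true_factors = [rng.standard_normal((n, rank)) for n in shape]
x = reconstruct(true_core, true_factors)

# Randomly chosen subsets of k indices per mode (a stand-in selection rule).
index_sets = [np.sort(rng.choice(n, size=k, replace=False)) for n in shape]
core, factors = subset_tucker(x, index_sets)
x_hat = reconstruct(core, factors)
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"relative error: {rel_err:.2e}")
```

On a noiseless tensor of exact low multilinear rank, this construction recovers the tensor up to numerical error whenever the selected subtensor has the same multilinear rank as the full tensor. With noisy data, the quality of the approximation depends on which indices are selected, which is where a principled selection criterion would matter. The selected indices in `index_sets` also double as a candidate feature subset along each mode.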