Common and Distinct Subspace Analysis in Data Fusion: Application to the Fusion of Brain Imaging Data
Date
2022-01-01
Department
Computer Science and Electrical Engineering
Program
Engineering, Electrical
Rights
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
Abstract
Data-driven methods, such as those based on independent component analysis (ICA), make very few assumptions about the data and the relationships among the datasets, and hence have been increasingly used for the fusion of multiple datasets across disciplines, including neuroscience. ICA enables unique decompositions under general conditions for a large class of signals, and independent components lend themselves to easier interpretation. However, traditional ICA-based fusion methods make simplifying assumptions that either limit their ability to allow full interaction among the datasets or reduce the flexibility of choosing appropriate algorithms and signal subspace orders across different datasets. Moreover, most existing techniques do not identify the association structure of the components, i.e., which components are common and which are distinct across datasets, and instead make a priori assumptions about that structure. Hence, a significant challenge is the development of a fusion method that allows the datasets to fully interact and inform each other and that estimates components explaining the common and distinct behavior of the datasets. An essential step following the estimation of neuroimaging fusion results is to find associations with non-imaging information such as cognitive or behavioral variables. This helps us to better understand and explain the evolution of neural and cognitive processes and to predict outcomes for intervention and treatment. However, identifying such associations is challenging with current ICA-based methods because imaging and non-imaging datasets differ in nature. In this dissertation, we address these challenges for data fusion. First, we develop two novel flexible fusion methods called consecutive independence and correlation transform (C-ICT) and disjoint subspace analysis using ICA (DS-ICA).
These methods build on ICA and its extension to multiple datasets, independent vector analysis (IVA), to allow maximum interaction among the datasets while providing flexibility in the selection of model orders and algorithms across the datasets. Second, we propose a new technique based on IVA and eigen-analysis, complete model identification using IVA (CMI-IVA), to estimate the full association structure of the components across datasets. Finally, we make use of two IVA-based algorithms, adaptively constrained IVA (ac-IVA) and IVA with Gaussian distribution (IVA-G), to identify the multivariate association between imaging and non-imaging components, and we use these two algorithms to develop a novel scheme for analyzing imaging and non-imaging datasets jointly. We demonstrate superior performance of our proposed methods compared with traditional ICA-based techniques, first in simulations and then using real neuroimaging and behavioral data collected from healthy subjects and schizophrenia patients. With the fusion of real neuroimaging datasets, we show that our proposed methods can estimate more interpretable, i.e., physically meaningful, components and their association structure across different neuroimaging datasets. These components also show significant differences between healthy controls and patients with schizophrenia and can be used as putative biomarkers. Furthermore, when the neuroimaging components are analyzed jointly with the behavioral variables, our methods can successfully identify the cognitive processes related to them, thus explaining the relationship between neural and cognitive processes. Though the focus of this work is the analysis and fusion of neuroimaging data, these methods can also be applied to other fields of study where multiple related datasets are available.