Data-Driven Techniques for Inference in Large-Scale fMRI Datasets: Homogeneous Subgroup Identification and Multi-Subject Analysis
Department
Computer Science and Electrical Engineering
Program
Engineering, Electrical
Rights
Distribution Rights granted to UMBC by the author.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Abstract
The availability of large-scale, open-source neuroimaging datasets has significantly expanded opportunities for brain research. By jointly analyzing multi-subject data from such repositories, researchers can draw group inferences across cohorts, enhance our understanding of brain function, and identify potential biomarkers or subtypes of different disorders. Additionally, large-scale datasets facilitate the detection of subtle effects that may not be statistically discernible in smaller cohorts. However, analyzing such data poses challenges due to high dimensionality, inter-subject variability, and the computational demands of existing methods. While existing frameworks are designed to effectively capture subject differences, their computational cost increases as the number of subjects grows. This dissertation addresses these challenges by developing data-driven techniques that efficiently analyze large-scale datasets, extract meaningful and reproducible features, and optimize computational performance. Using multi-subject resting-state fMRI (rs-fMRI) as a case study, we demonstrate the effectiveness of these methods in applications such as homogeneous subgroup identification and biomarker detection. Our proposed techniques preserve subject variability while maintaining computational efficiency, enabling the identification of clinically meaningful subgroups and biomarkers from various large psychiatric cohorts. We begin by introducing foundational concepts in fMRI data analysis and commonly used techniques such as blind source separation (BSS) and joint BSS (JBSS) methods. We then present studies of subgroup identification from multi-subject rs-fMRI data, highlighting the advantages of JBSS techniques in preserving subject variability. We propose modeling cross-functional network information as a multiplex network, which enhances subgroup identification performance by taking this multi-dimensional information into account.
To address the computational complexity limitations of current JBSS methods, we develop methods that enhance computational efficiency while preserving subject variability. These techniques position data-driven and model-driven approaches as two ends of a spectrum and seek an optimal balance between them through either flexible constraint selection schemes or a representative coreset strategy. Furthermore, we extend the coreset concept to higher-dimensional data, developing an efficient tensor-based method for complex fMRI research tasks such as dynamic functional network analysis over time. We conclude by summarizing our proposed large-scale data analysis techniques and providing guidelines for selecting appropriate methods based on specific research needs.
