A Comprehensive Guide to Multiset Canonical Correlation Analysis and its Application to Joint Blind Source Separation
Citation of Original Publication
Lehmann, Isabell, Ben Gabrielson, Tanuj Hasija, and Tülay Adali. “A Comprehensive Guide to Multiset Canonical Correlation Analysis and Its Application to Joint Blind Source Separation.” IEEE Transactions on Signal Processing, (October 24, 2025): 1–16. https://doi.org/10.1109/TSP.2025.3623874.
Rights
Attribution 4.0 International
Subjects
Random variables
Symbols
joint blind source separation
source identification conditions
Feature extraction
Blind source separation
UMBC Machine Learning and Signal Processing Lab (MLSP-Lab)
Covariance matrices
multiset canonical correlation analysis
Linear programming
Correlation
generalized canonical correlation analysis
Vectors
UMBC Ebiquity Research Group
UMBC Machine Learning for Signal Processing Lab
Optimization
Reviews
Abstract
Multiset Canonical Correlation Analysis (mCCA), also called Generalized Canonical Correlation Analysis (GCCA), is a technique to identify correlated variables across multiple datasets, which can be used for feature extraction in fields like neuroscience, cross-language information retrieval, and recommendation systems, among others. Despite its wide use, there is still a lack of comprehensive understanding of its theory and implementation with different objective functions all under one umbrella. In this paper, we review the five commonly used mCCA methods: sumcor, maxvar, minvar, genvar, and ssqcor. We provide a concise overview of their optimization problems along with their solutions and pseudocode. We then discuss the application of mCCA for estimating underlying latent components in the Joint Blind Source Separation (JBSS) problem and propose the source identification conditions of the different mCCA methods, i.e., the conditions under which they are able to achieve JBSS. We substantiate the proposed theoretical conditions with numerical results and test the statistical efficiency of the methods for finite samples. We observe in our experiments that genvar appears to have the least restrictive source identification conditions and to be more statistically efficient than the other methods. This suggests that genvar is generally the best-performing mCCA method for JBSS except for special cases, which is an important finding, as the most commonly used mCCA methods are maxvar and sumcor.
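As a rough illustration of how the five methods named in the abstract differ, the sketch below evaluates each objective on the K x K correlation matrix of one set of canonical variables (one variable per dataset), following the way these cost functions are usually defined in the mCCA literature. It is not the pseudocode from the paper: the function name `mcca_objectives`, the variable `phi`, and the synthetic data are assumptions made for this example, and it only scores a given set of canonical variables rather than optimizing the transformation vectors.

```python
import numpy as np


def mcca_objectives(phi):
    """Evaluate the five classical mCCA cost functions.

    phi is assumed to be the K x K correlation matrix of the canonical
    variables, one variable per dataset (hypothetical helper, not from the paper).
    """
    phi = np.asarray(phi, dtype=float)
    eigvals = np.linalg.eigvalsh(phi)  # eigenvalues in ascending order
    return {
        "sumcor": float(phi.sum()),           # sum of all correlations (maximized)
        "maxvar": float(eigvals[-1]),         # largest eigenvalue (maximized)
        "minvar": float(eigvals[0]),          # smallest eigenvalue (minimized)
        "genvar": float(np.prod(eigvals)),    # determinant of phi (minimized)
        "ssqcor": float((phi ** 2).sum()),    # sum of squared correlations (maximized)
    }


# Synthetic example: three canonical variables that share one latent signal.
rng = np.random.default_rng(0)
shared = rng.standard_normal(1000)
u = np.stack([shared + 0.5 * rng.standard_normal(1000) for _ in range(3)])
phi = np.corrcoef(u)  # 3 x 3 correlation matrix across the three datasets
scores = mcca_objectives(phi)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For uncorrelated canonical variables (phi equal to the identity) every eigenvalue is 1, so maxvar, minvar, and genvar all equal 1 and sumcor and ssqcor both equal K. The stronger the correlation across datasets, the larger sumcor, maxvar, and ssqcor become and the smaller minvar and genvar become.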
