LG AI ML

Oct 2, 2023

Unconstrained Stochastic CCA: Unifying Multiview and Self-Supervised Learning

James Chapman, Lennie Wells, Ana Lawry Aguila

arXiv:2310.01012v47.74 citations

h-index: 5

Has Code

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in multiview and self-supervised learning by making CCA scale to large datasets, though it is an incremental improvement on existing methods.

The authors address the computational cost of Canonical Correlation Analysis (CCA) methods on large-scale data by proposing a novel unconstrained objective and fast stochastic gradient descent algorithms. They report far faster convergence and higher correlations than the previous state-of-the-art on standard benchmarks, along with a first-of-its-kind PLS analysis of a UK Biobank dataset with over 33,000 individuals and 500,000 features.

The Canonical Correlation Analysis (CCA) family of methods is foundational in multiview learning. Regularised linear CCA methods can be seen to generalise Partial Least Squares (PLS) and be unified with a Generalized Eigenvalue Problem (GEP) framework. However, classical algorithms for these linear methods are computationally infeasible for large-scale data. Extensions to Deep CCA show great promise, but current training procedures are slow and complicated. First we propose a novel unconstrained objective that characterizes the top subspace of GEPs. Our core contribution is a family of fast algorithms for stochastic PLS, stochastic CCA, and Deep CCA, simply obtained by applying stochastic gradient descent (SGD) to the corresponding CCA objectives. Our algorithms show far faster convergence and recover higher correlations than the previous state-of-the-art on all standard CCA and Deep CCA benchmarks. These improvements allow us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset from the UK Biobank, with over 33,000 individuals and 500,000 features. Finally, we apply our algorithms to match the performance of "CCA-family" Self-Supervised Learning (SSL) methods on CIFAR-10 and CIFAR-100 with minimal hyper-parameter tuning, and also present theory to clarify the links between these methods and classical CCA, laying the groundwork for future insights.
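The abstract does not spell out the unconstrained objective, so the following is only a minimal sketch under stated assumptions. It assumes an Eckhart-Young style loss, L(W) = -2 tr(Wᵀ A W) + ‖Wᵀ B W‖²_F, for the GEP A w = λ B w, with A holding the cross-covariance blocks and B the within-view covariance blocks for CCA. It also uses plain full-batch gradient descent on synthetic data rather than the stochastic minibatch updates the abstract describes. All function names (`make_views`, `gep_matrices`, `ey_loss_and_grad`, `fit`, `reference_correlations`) are hypothetical and not taken from the authors' code.

```python
import numpy as np

def make_views(n=2000, p=6, q=5, latent_dim=2, seed=0):
    """Synthetic two-view data sharing a low-dimensional latent signal."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, latent_dim))
    X = z @ rng.standard_normal((latent_dim, p)) + rng.standard_normal((n, p))
    Y = z @ rng.standard_normal((latent_dim, q)) + rng.standard_normal((n, q))
    # Standardise each column so the covariance blocks are well scaled.
    X = (X - X.mean(0)) / X.std(0)
    Y = (Y - Y.mean(0)) / Y.std(0)
    return X, Y

def gep_matrices(X, Y):
    """Build A (cross-covariances) and B (within-view covariances) for CCA."""
    n, p = X.shape
    q = Y.shape[1]
    Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    A = np.zeros((p + q, p + q))
    A[:p, p:] = Sxy
    A[p:, :p] = Sxy.T
    B = np.zeros((p + q, p + q))
    B[:p, :p] = Sxx
    B[p:, p:] = Syy
    return A, B

def ey_loss_and_grad(W, A, B):
    """Assumed objective: -2 tr(W'AW) + ||W'BW||_F^2, with its gradient."""
    AW, BW = A @ W, B @ W
    M = W.T @ BW                       # k x k, symmetric
    loss = -2.0 * np.trace(W.T @ AW) + np.sum(M * M)
    grad = -4.0 * AW + 4.0 * BW @ M
    return loss, grad

def fit(A, B, k=2, lr=0.01, steps=10000, seed=1):
    """Full-batch gradient descent from a small random start (no constraints)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((A.shape[0], k))
    for _ in range(steps):
        _, grad = ey_loss_and_grad(W, A, B)
        W -= lr * grad
    return W

def reference_correlations(X, Y):
    """Classical canonical correlations via whitening and an SVD."""
    n = X.shape[0]
    Lx = np.linalg.cholesky(X.T @ X / n)
    Ly = np.linalg.cholesky(Y.T @ Y / n)
    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the correlations.
    T = np.linalg.solve(Ly, np.linalg.solve(Lx, X.T @ Y / n).T).T
    return np.linalg.svd(T, compute_uv=False)
```

Under these assumptions, calling `fit(*gep_matrices(X, Y), k=2)` on data from `make_views()` gives a `W` for which the eigenvalues of `W.T @ B @ W` can be compared against the leading entries of `reference_correlations(X, Y)`. No whitening or orthogonality constraint is enforced during optimisation, which is the property that makes a loss of this shape convenient for gradient-based training.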
