LG · IR · SP · ML · Mar 25, 2020

Generalized Canonical Correlation Analysis: A Subspace Intersection Approach

arXiv:2003.11205v1 · 40 citations
AI Analysis

This work addresses GCCA, a tool used in data mining and machine learning for finding correlated variables across multiple views, offering an incremental improvement with a new algebraic interpretation and algorithm.

The paper tackles the problem of Generalized Canonical Correlation Analysis (GCCA) by proposing a novel algebraic perspective based on subspace intersection, leading to a scalable algorithm that effectively handles large tasks, as demonstrated through synthetic and real data experiments.

Generalized Canonical Correlation Analysis (GCCA) is an important tool that finds numerous applications in data mining, machine learning, and artificial intelligence. It aims at finding 'common' random variables that are strongly correlated across multiple feature representations (views) of the same set of entities. CCA, and to a lesser extent GCCA, have been studied from the statistical and algorithmic points of view, but not as much from the standpoint of linear algebra. This paper offers a fresh algebraic perspective of GCCA based on a (bi-)linear generative model that naturally captures its essence. It is shown that, from a linear algebra point of view, GCCA is tantamount to subspace intersection, and conditions under which the common subspace of the different views is identifiable are provided. A novel GCCA algorithm is proposed based on subspace intersection, which scales up to handle large GCCA tasks. Synthetic as well as real data experiments are provided to showcase the effectiveness of the proposed approach.
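
The subspace-intersection reading of GCCA can be illustrated with a small NumPy sketch. This is not the paper's algorithm: it is a generic, dense illustration that assumes each view is a full-column-rank matrix whose column space contains a shared component, and it recovers that shared subspace from the eigenvectors of the sum of the views' orthogonal projectors. The function name `common_subspace`, the synthetic model, and all dimensions are assumptions made for this example. Forming the n-by-n projector sum is only practical for small n.

```python
import numpy as np

def common_subspace(views, k):
    """Estimate a k-dimensional subspace shared by the column spaces of all views.

    Hypothetical helper for illustration; assumes every view has full column rank.
    """
    n = views[0].shape[0]
    # Sum of the orthogonal projectors onto each view's column space.
    P = np.zeros((n, n))
    for X in views:
        Q, _ = np.linalg.qr(X)  # orthonormal basis of col(X)
        P += Q @ Q.T
    # A vector lying in every column space is an eigenvector of P
    # with eigenvalue equal to the number of views.
    eigvals, eigvecs = np.linalg.eigh(P)  # eigenvalues in ascending order
    return eigvecs[:, -k:], eigvals[-k:]

# Synthetic data under an assumed linear model: each view mixes a shared
# component G with its own private component H.
rng = np.random.default_rng(0)
n, k, private, n_views = 200, 2, 5, 3
G = rng.standard_normal((n, k))
views = []
for _ in range(n_views):
    H = rng.standard_normal((n, private))
    A = rng.standard_normal((k + private, k + private))
    views.append(np.hstack([G, H]) @ A)

G_hat, top_eigvals = common_subspace(views, k)

# Distance between the estimated subspace and col(G): project G_hat onto
# col(G) and measure what is left over.
Qg, _ = np.linalg.qr(G)
residual = np.linalg.norm(G_hat - Qg @ (Qg.T @ G_hat))
print("top eigenvalues:", top_eigvals)
print("residual outside col(G):", residual)
```

In this noiseless setting the top k eigenvalues should equal the number of views and the residual should be at the level of floating-point error. With noisy views the intersection is only approximate, and the same eigenvectors give a least-squares style estimate instead.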

