Non-linear Canonical Correlation Analysis: A Compressed Representation Approach
This work addresses the computational and performance limitations of non-linear CCA for multi-view data analysis and offers a novel framework with applications in representation learning. The contribution is incremental, however, as it builds upon the existing Alternating Conditional Expectation (ACE) algorithm and established information-theoretic methods.
The authors tackle non-linear Canonical Correlation Analysis (CCA) by introducing a compressed representation framework (CRCCA) that extends the Alternating Conditional Expectation (ACE) algorithm. The framework controls the trade-off between model flexibility and complexity, and it comes with theoretical bounds and optimality conditions.
Canonical Correlation Analysis (CCA) is a linear representation learning method that seeks maximally correlated variables in multi-view data. Non-linear CCA extends this notion to a broader family of transformations, which are more powerful in many real-world applications. Given the joint probability distribution, the Alternating Conditional Expectation (ACE) algorithm provides an optimal solution to the non-linear CCA problem. However, it suffers from limited performance and an increasing computational burden when only a finite number of samples is available. In this work we introduce an information-theoretic compressed representation framework for the non-linear CCA problem (CRCCA), which extends the classical ACE approach. Our suggested framework seeks compact representations of the data that allow a maximal level of correlation. In this way, we control the trade-off between the flexibility and the complexity of the model. CRCCA provides theoretical bounds and optimality conditions, as we establish fundamental connections to rate-distortion theory, the information bottleneck, and remote source coding. In addition, it allows a soft dimensionality reduction, as the compression level is determined by the mutual information between the original noisy data and the extracted signals. Finally, we introduce a simple implementation of the CRCCA framework, based on lattice quantization.
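To make the ACE idea in the abstract concrete, the following is a minimal sketch of sample-based ACE run on uniformly quantized scalar data. It is not the authors' CRCCA implementation: the function names (`quantize`, `conditional_mean`, `ace_quantized`), the `step` parameter, and the toy data are all assumptions introduced for illustration, and the quantizer is the simplest one-dimensional case (a scaled integer lattice) rather than a general lattice quantizer.

```python
import numpy as np

def quantize(v, step):
    """Uniform scalar quantization: index of the nearest point on the
    1-D lattice step * Z (illustrative stand-in for a lattice quantizer)."""
    return np.round(v / step).astype(int)

def conditional_mean(target, cells):
    """Empirical E[target | cell], evaluated at every sample."""
    _, inverse = np.unique(cells, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.bincount(inverse, weights=target)
    counts = np.bincount(inverse)
    return (sums / counts)[inverse]

def standardize(v):
    """Shift to zero mean and scale to unit variance."""
    v = v - v.mean()
    std = v.std()
    if std == 0:
        raise ValueError("degenerate (constant) transformation")
    return v / std

def ace_quantized(x, y, step, n_iter=50):
    """Alternate f <- E[g | Q(x)] and g <- E[f | Q(y)], standardizing
    after each update. Returns f, g and their sample correlation."""
    qx, qy = quantize(x, step), quantize(y, step)
    g = standardize(y.astype(float))
    for _ in range(n_iter):
        f = standardize(conditional_mean(g, qx))
        g = standardize(conditional_mean(f, qy))
    # f and g have zero mean and unit variance, so this is Pearson's rho.
    return f, g, float(np.mean(f * g))

# Toy data (assumed): y depends on x only through x**2, so the linear
# correlation is close to zero while the non-linear dependence is strong.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=5000)
y = x**2 + 0.05 * rng.normal(size=5000)

linear_rho = float(np.corrcoef(x, y)[0, 1])
f, g, rho = ace_quantized(x, y, step=0.1)
print(f"linear correlation: {linear_rho:.3f}, ACE correlation: {rho:.3f}")
```

In this sketch the step size plays the role of a crude complexity knob: a larger `step` yields fewer cells and hence a coarser, less flexible pair of transformations, while a smaller `step` yields more cells and a greater risk of overfitting a finite sample. The correlation reported here is measured in-sample, so it is optimistic; a held-out estimate would be needed for a fair comparison.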