LG · Feb 27

Provable Subspace Identification of Nonlinear Multi-view CCA

Zhiwei Han, Stefan Matthes, Hao Shen
arXiv:2602.23785v1
Originality: Incremental advance
AI Analysis

This work addresses the identification of shared latent structure in nonlinear multi-view data. It is rated incremental because it builds on existing CCA methods, contributing theoretical guarantees for them.

The paper tackles the identifiability of nonlinear Canonical Correlation Analysis (CCA) in multi-view setups by reframing it as a subspace identification problem. It proves that multi-view CCA recovers the pairwise correlated signal subspaces up to view-wise orthogonal ambiguity and, for N ≥ 3 views, isolates the subspaces jointly correlated across all views. Experiments on synthetic and rendered image datasets validate the theory.

We investigate the identifiability of nonlinear Canonical Correlation Analysis (CCA) in a multi-view setup, where each view is generated by an unknown nonlinear map applied to a linear mixture of shared latents and view-private noise. Rather than attempting exact unmixing, a problem proven to be ill-posed, we instead reframe multi-view CCA as a basis-invariant subspace identification problem. We prove that, under suitable latent priors and spectral separation conditions, multi-view CCA recovers the pairwise correlated signal subspaces up to view-wise orthogonal ambiguity. For $N \geq 3$ views, the objective provably isolates the jointly correlated subspaces shared across all views while eliminating view-private variations. We further establish finite-sample consistency guarantees by translating the concentration of empirical cross-covariances into explicit subspace error bounds via spectral perturbation theory. Experiments on synthetic and rendered image datasets validate our theoretical findings and confirm the necessity of the assumed conditions.
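To make "recovery up to orthogonal ambiguity" concrete, here is a minimal sketch of the two-view, purely linear special case. It is not the authors' method: the abstract's unknown nonlinear maps are omitted, and every name in it (`whiten`, `linear_cca`, the dimensions `n`, `k`, `p`, the mixing matrices `a1`, `a2`) is an illustrative assumption. Each view is an invertible linear mixture of shared latents `z` and view-private noise. Classical CCA, computed as an SVD of the whitened cross-covariance, then returns variates that match the whitened latents only up to a `k x k` orthogonal matrix `q`.

```python
import numpy as np


def whiten(x, eps=1e-12):
    """Center x and transform it so its sample covariance is the identity."""
    x = x - x.mean(axis=0)
    cov = x.T @ x / len(x)
    evals, evecs = np.linalg.eigh(cov)
    w = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return x @ w


def linear_cca(x1, x2, k):
    """Classical two-view CCA via SVD of the whitened cross-covariance."""
    x1w, x2w = whiten(x1), whiten(x2)
    c12 = x1w.T @ x2w / len(x1)
    u, s, vt = np.linalg.svd(c12)
    # Top-k canonical variates of each view, plus all canonical correlations.
    return x1w @ u[:, :k], x2w @ vt[:k].T, s


# Assumed toy setup: k shared latents, p private noise dimensions per view.
rng = np.random.default_rng(0)
n, k, p = 5000, 2, 3
z = rng.standard_normal((n, k))           # shared latents
n1 = rng.standard_normal((n, p))          # noise private to view 1
n2 = rng.standard_normal((n, p))          # noise private to view 2
a1 = rng.standard_normal((k + p, k + p))  # hypothetical mixing, view 1
a2 = rng.standard_normal((k + p, k + p))  # hypothetical mixing, view 2
x1 = np.hstack([z, n1]) @ a1
x2 = np.hstack([z, n2]) @ a2

u1, u2, corrs = linear_cca(x1, x2, k)

# If u1 equals the whitened latents times an orthogonal matrix, then q is
# (numerically) orthogonal and u1 is reproduced exactly by whiten(z) @ q.
zw = whiten(z)
q = zw.T @ u1 / n
ortho_err = np.linalg.norm(q.T @ q - np.eye(k))
recon_err = np.linalg.norm(u1 - zw @ q) / np.linalg.norm(u1)

print("canonical correlations:", np.round(corrs, 3))
print("orthogonality error of q:", ortho_err)
print("relative reconstruction error:", recon_err)
```

In this noise-free linear toy, the top `k` canonical correlations sit at essentially 1 and the rest stay near zero. That gap is a simple stand-in for the spectral separation the abstract assumes. Because the top singular values coincide, the individual canonical directions are arbitrary within the shared subspace, so only the subspace itself, not a particular basis, is identified.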
