The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA
This work addresses the challenge of synthesizing disparate multimodal data into a unified representation, offering a theoretical foundation for applications in fields such as neuroscience or sensor fusion. The contribution is incremental in that it builds on existing nonlinear ICA work.
The paper tackles the problem of recovering a common latent source with independent components from multiple nonlinear, noisy views. It proves that identifiability is possible when the views are considered jointly, whereas it is impossible in single-view nonlinear ICA.
We consider the problem of recovering a common latent source with independent components from multiple views. This applies to settings in which a variable is measured with multiple experimental modalities, and where the goal is to synthesize the disparate measurements into a single unified representation. We consider the case in which the observed views are a nonlinear mixing of component-wise corruptions of the sources. When the views are considered separately, this reduces to nonlinear Independent Component Analysis (ICA), for which it is provably impossible to undo the mixing. We present novel identifiability proofs showing that undoing the mixing is possible when the multiple views are considered jointly, so that the mixing can theoretically be undone using function approximators such as deep neural networks. In contrast to known identifiability results for nonlinear ICA, we prove that independent latent sources with arbitrary mixing can be recovered as long as multiple, sufficiently different noisy views are available.
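To make the setting concrete, the data-generating process described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not taken from the paper: the sources are drawn from a Laplace distribution, the component-wise corruption is additive Gaussian noise, and each view's mixing is a small random leaky-ReLU network with square weight matrices. The names `make_mixing` and `sample_views` are hypothetical.

```python
import numpy as np


def leaky_relu(z, slope=0.2):
    # Invertible elementwise nonlinearity (assumed choice, not from the paper).
    return np.where(z > 0, z, slope * z)


def make_mixing(rng, dim, n_layers=2):
    """Build a hypothetical nonlinear mixing f: R^dim -> R^dim.

    Square Gaussian weight matrices are invertible with probability one,
    and leaky ReLU is invertible, so f is invertible almost surely.
    """
    weights = [rng.normal(size=(dim, dim)) for _ in range(n_layers)]

    def f(z):
        h = z
        for w in weights[:-1]:
            h = leaky_relu(h @ w)
        return h @ weights[-1]

    return f


def sample_views(n_samples=1000, dim=3, n_views=2, noise_scale=0.5, seed=0):
    """Sample shared sources and several noisy, nonlinearly mixed views."""
    rng = np.random.default_rng(seed)
    # Common latent source with independent components (Laplace is an assumption).
    s = rng.laplace(size=(n_samples, dim))
    views = []
    for _ in range(n_views):
        # Component-wise corruption: each source coordinate gets its own noise,
        # drawn independently for every view (additive noise is an assumption).
        corrupted = s + noise_scale * rng.normal(size=(n_samples, dim))
        # Each view has its own mixing function.
        f = make_mixing(rng, dim)
        views.append(f(corrupted))
    return s, views


s, views = sample_views()
```

Each element of `views` on its own is an instance of the single-view nonlinear ICA problem. The identifiability results summarized above concern what can be recovered about `s` when the views are used together. The sketch only generates data and does not attempt the recovery itself.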