ML LG Aug 12, 2022

Function Classes for Identifiable Nonlinear Independent Component Analysis

Simon Buchholz, Michel Besserve, Bernhard Schölkopf

arXiv:2208.06406v1, 27.358 citations, h-index: 169

Originality: Incremental advance

AI Analysis

This work addresses the challenge of ensuring that unsupervised latent variable models reflect the true underlying factors, which matters for generalization in downstream machine learning tasks. The advance is incremental: it builds on previously proposed function-class constraints such as orthogonal coordinate transformations.

The paper tackles the problem of identifiability in nonlinear Independent Component Analysis. It proves that conformal maps, a subclass of orthogonal coordinate transformations, are identifiable, and it provides theoretical results suggesting that orthogonal coordinate transformations have properties that rule out families of spurious solutions in generic settings.

Unsupervised learning of latent variable models (LVMs) is widely used to represent data in machine learning. When such models reflect the ground truth factors and the mechanisms mapping them to observations, there is reason to expect that they allow generalization in downstream tasks. It is, however, well known that such identifiability guarantees are typically not achievable without putting constraints on the model class. This is notably the case for nonlinear Independent Component Analysis, in which the LVM maps statistically independent variables to observations via a deterministic nonlinear function. In generic settings, one can construct several families of spurious solutions that fit the data perfectly but do not correspond to the ground truth factors. However, recent work suggests that constraining the function class of such models may promote identifiability. Specifically, function classes with constraints on their partial derivatives, gathered in the Jacobian matrix, have been proposed, such as orthogonal coordinate transformations (OCT), which impose orthogonality of the Jacobian columns. In the present work, we prove that a subclass of these transformations, conformal maps, is identifiable and provide novel theoretical results suggesting that OCTs have properties that prevent families of spurious solutions from spoiling identifiability in a generic setting.
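
To make the two function classes named in the abstract concrete, the following is a minimal sketch, not the authors' code, that numerically checks the Jacobian conditions at a single point. It assumes the standard definitions: a map is an OCT at a point if the off-diagonal entries of J^T J vanish (the Jacobian columns are orthogonal), and conformal if in addition all diagonal entries are equal (the columns share one norm). The helper names (`jacobian`, `gram`, `is_oct_at`, `is_conformal_at`) and the three example maps are illustrative choices, and a pointwise finite-difference check is of course not a proof that a map belongs to either class.

```python
import math


def jacobian(f, s, eps=1e-6):
    """Central finite-difference Jacobian of f at s; entry [i][j] = d f_i / d s_j."""
    columns = []
    for j in range(len(s)):
        plus, minus = list(s), list(s)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = f(plus), f(minus)
        columns.append([(a - b) / (2 * eps) for a, b in zip(f_plus, f_minus)])
    # columns[j] holds the j-th Jacobian column; transpose to row-major form.
    return [[columns[j][i] for j in range(len(s))] for i in range(len(columns[0]))]


def gram(J):
    """J^T J: entry [j][k] is the inner product of Jacobian columns j and k."""
    n = len(J[0])
    return [[sum(row[j] * row[k] for row in J) for k in range(n)] for j in range(n)]


def is_oct_at(f, s, tol=1e-5):
    """True if the Jacobian columns of f are (numerically) orthogonal at s."""
    G = gram(jacobian(f, s))
    n = len(G)
    return all(abs(G[j][k]) < tol for j in range(n) for k in range(n) if j != k)


def is_conformal_at(f, s, tol=1e-5):
    """True if the columns are orthogonal and all have the same norm at s."""
    G = gram(jacobian(f, s))
    same_norm = all(abs(G[j][j] - G[0][0]) < tol for j in range(len(G)))
    return is_oct_at(f, s, tol) and same_norm


def polar(s):
    """Polar coordinates (r, theta) -> (x, y): orthogonal columns, norms 1 and r."""
    r, theta = s
    return [r * math.cos(theta), r * math.sin(theta)]


def complex_exp(s):
    """Complex exponential as a map R^2 -> R^2: Jacobian is e^x times a rotation."""
    x, y = s
    return [math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)]


def shear(s):
    """Linear shear: Jacobian columns (1, 0) and (1, 1) are not orthogonal."""
    x, y = s
    return [x + y, y]


if __name__ == "__main__":
    print("polar       OCT:", is_oct_at(polar, [2.0, 0.7]),
          "conformal:", is_conformal_at(polar, [2.0, 0.7]))
    print("complex_exp OCT:", is_oct_at(complex_exp, [0.3, 1.1]),
          "conformal:", is_conformal_at(complex_exp, [0.3, 1.1]))
    print("shear       OCT:", is_oct_at(shear, [0.3, 1.1]),
          "conformal:", is_conformal_at(shear, [0.3, 1.1]))
```

In this sketch the polar map satisfies the OCT condition without being conformal (away from r = 1), the complex exponential satisfies both, and the shear satisfies neither, which illustrates that conformal maps form a strict subclass of OCTs.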
