LG CV SP ML
Jun 9, 2021

I Don't Need u: Identifiable Non-Linear ICA Without Side Information

arXiv:2106.05238v417.229 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reliable latent representation recovery in machine learning, offering a more practical approach for applications where side information is unavailable, though it is incremental in building on prior identifiability research.

The paper tackles the problem of algorithmic stability in unsupervised non-linear representation learning, finding that deep generative models with latent clustering achieve empirical identifiability comparable to models requiring side information, without needing auxiliary labels.

In this paper, we investigate the algorithmic stability of unsupervised representation learning with deep generative models, as a function of repeated re-training on the same input data. Algorithms for learning low dimensional linear representations -- for example principal components analysis (PCA), or linear independent components analysis (ICA) -- come with guarantees that they will always reveal the same latent representations (perhaps up to an arbitrary rotation or permutation). Unfortunately, for non-linear representation learning, such as in a variational auto-encoder (VAE) model trained by stochastic gradient descent, we have no such guarantees. Recent work on identifiability in non-linear ICA has introduced a family of deep generative models that have identifiable latent representations, achieved by conditioning on side information (e.g. informative labels). We empirically evaluate the stability of these models under repeated re-estimation of parameters, and compare them to both standard VAEs and deep generative models which learn to cluster in their latent space. Surprisingly, we discover side information is not necessary for algorithmic stability: using standard quantitative measures of identifiability, we find deep generative models with latent clusterings are empirically identifiable to the same degree as models which rely on auxiliary labels. We relate these results to the possibility of identifiable non-linear ICA.
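To make the idea of comparing latent representations across re-training runs concrete, here is a minimal sketch of a mean correlation coefficient (MCC), a measure commonly used in the non-linear ICA literature. That the paper uses this exact variant is an assumption: the abstract only refers to "standard quantitative measures of identifiability". The function names and the toy data (`run_a`, `run_b`, `run_c`) are hypothetical stand-ins for latents inferred by separately trained models.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_correlation_coefficient(z_a, z_b):
    """MCC between two runs' latents, each given as a list of per-dimension columns.

    Absolute correlations make the score invariant to sign flips, and the
    search over permutations makes it invariant to reordering of dimensions.
    Brute-force search is only practical for small latent sizes.
    """
    d = len(z_a)
    corr = [[abs(pearson(z_a[i], z_b[j])) for j in range(d)] for i in range(d)]
    best = max(
        sum(corr[i][perm[i]] for i in range(d))
        for perm in itertools.permutations(range(d))
    )
    return best / d


rng = random.Random(0)
# Hypothetical latents from a first training run: 3 dimensions, 500 samples.
run_a = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(3)]
# Hypothetical second run: same latents, permuted and with one sign flipped.
run_b = [[-v for v in run_a[2]], run_a[0], run_a[1]]
# Hypothetical unrelated run: independent noise.
run_c = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(3)]

mcc_same = mean_correlation_coefficient(run_a, run_b)
mcc_diff = mean_correlation_coefficient(run_a, run_c)
print(f"MCC, permuted and sign-flipped run: {mcc_same:.3f}")
print(f"MCC, unrelated run: {mcc_diff:.3f}")
```

A score near 1 indicates that two runs recovered the same latents up to permutation and sign, which is the kind of stability the abstract describes; a score near 0 indicates unrelated representations. For larger latent dimensions, the permutation search would normally be replaced by a linear assignment solver.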

