Relevance Factor VAE: Learning and Identifying Disentangled Factors
The work addresses a key bottleneck in unsupervised disentangled representation learning, though it is an incremental improvement over prior VAE approaches.
The paper tackles a weakness of existing VAE-based disentanglement methods: they cannot distinguish meaningful from nuisance latent factors, which degrades disentanglement performance. It proposes Relevance-Factor-VAE, which uses learned relevance indicator variables to focus the total correlation loss on the relevant factors, and reports outperforming existing methods on benchmark datasets.
We propose a novel VAE-based deep auto-encoder model that can learn disentangled latent representations in a fully unsupervised manner, endowed with the ability to identify all meaningful sources of variation and their cardinality. Our model, dubbed Relevance-Factor-VAE, leverages the total correlation (TC) in the latent space to achieve the disentanglement goal, but also addresses the key issue of existing approaches which cannot distinguish between meaningful and nuisance factors of latent variation, often the source of considerable degradation in disentanglement performance. We tackle this issue by introducing the so-called relevance indicator variables that can be automatically learned from data, together with the VAE parameters. Our model effectively focuses the TC loss onto the relevant factors only by tolerating large prior KL divergences, a desideratum justified by our semi-parametric theoretical analysis. Using a suite of disentanglement metrics, including a newly proposed one, as well as qualitative evidence, we demonstrate that our model outperforms existing methods across several challenging benchmark datasets.
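To make the role of the relevance indicators more concrete, here is a minimal sketch of one way a per-dimension prior KL penalty could be weighted by relevance. It is not the authors' implementation. The function names, the linear interpolation between weights, and the values of `lam_min` and `lam_max` are all assumptions made for illustration. The sketch also assumes a diagonal Gaussian encoder with a standard normal prior, and it covers only the KL side: the TC term, which the abstract says is focused on the relevant factors, would need a separate estimator and is omitted.

```python
import math


def gaussian_kl_per_dim(mu, logvar):
    """KL( N(mu_j, exp(logvar_j)) || N(0, 1) ) for each latent dimension j."""
    return [0.5 * (m * m + math.exp(lv) - lv - 1.0) for m, lv in zip(mu, logvar)]


def relevance_weighted_kl(mu, logvar, relevance, lam_min=0.1, lam_max=10.0):
    """Sum of per-dimension prior KL terms, weighted by relevance.

    `relevance` holds hypothetical indicators r_j in [0, 1].
    r_j near 1 gives the small weight lam_min, so a large KL is tolerated.
    r_j near 0 gives the large weight lam_max, pushing that dimension
    toward the prior (an assumed treatment of nuisance dimensions).
    """
    kls = gaussian_kl_per_dim(mu, logvar)
    weights = [r * lam_min + (1.0 - r) * lam_max for r in relevance]
    return sum(w * k for w, k in zip(weights, kls))


# Two latent dimensions with identical posteriors but different relevance.
mu = [2.0, 2.0]
logvar = [0.0, 0.0]
penalty = relevance_weighted_kl(mu, logvar, relevance=[1.0, 0.0])
```

In this toy setting both dimensions have the same KL to the prior, yet almost all of the penalty comes from the dimension marked as nuisance. That illustrates how a relevant dimension could keep an informative, far-from-prior posterior at little cost.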