Disentangled Representation Learning with Wasserstein Total Correlation
This work, aimed at machine learning researchers, addresses the metric-agnostic and sample-sensitive behavior of KL-based total correlation in disentangled representation learning, though it appears incremental because it modifies an existing approach.
The paper tackles the problem of learning disentangled representations in unsupervised learning by introducing Wasserstein total correlation as an alternative to KL divergence-based methods, showing comparable disentanglement performance with smaller reconstruction losses.
Unsupervised learning of disentangled representations involves uncovering the different factors of variation that contribute to the data generation process. Total correlation penalization has been a key component of recent disentanglement methods. However, Kullback-Leibler (KL) divergence-based total correlation is metric-agnostic and sensitive to data samples. In this paper, we introduce Wasserstein total correlation in both variational autoencoder and Wasserstein autoencoder settings to learn disentangled latent representations. A critic is adversarially trained along with the main objective to estimate the Wasserstein total correlation term. We discuss the benefits of using Wasserstein distance over KL divergence to measure independence and conduct quantitative and qualitative experiments on several datasets. Moreover, we introduce a new metric to measure disentanglement. We show that the proposed approach achieves comparable disentanglement performance while sacrificing less reconstruction quality.
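To make the critic-based estimate concrete, the following is a minimal sketch rather than the paper's implementation. It assumes the Wasserstein total correlation is the Wasserstein-1 distance between the joint latent distribution q(z) and the product of its marginals, estimated through the Kantorovich-Rubinstein dual form E_q(z)[f(z)] - E_prod q(z_j)[f(z)] for a 1-Lipschitz critic f. Sampling from the product of marginals by shuffling each latent dimension across the batch is an assumption borrowed from common practice, and the names `permute_dims`, `wasserstein_tc_estimate`, and `toy_critic` are hypothetical. The critic here is a fixed hand-written function; in an adversarial setup it would be a trained network.

```python
import numpy as np


def permute_dims(z, rng):
    """Shuffle each latent dimension independently across the batch.

    Rows of the result approximate samples from the product of the
    marginals of z (assumed sampling scheme, not taken from the paper).
    """
    z_perm = np.empty_like(z)
    for j in range(z.shape[1]):
        z_perm[:, j] = z[rng.permutation(z.shape[0]), j]
    return z_perm


def wasserstein_tc_estimate(critic, z, rng):
    """Dual-form estimate for a fixed critic f:
    mean of f over joint samples minus mean of f over permuted samples.
    """
    z_perm = permute_dims(z, rng)
    return float(np.mean(critic(z)) - np.mean(critic(z_perm)))


def toy_critic(z):
    """Hypothetical hand-written critic for a 2-D latent space.

    -|z0 - z1| / sqrt(2) is 1-Lipschitz in the Euclidean norm and scores
    points near the diagonal z0 = z1 higher than points far from it.
    """
    return -np.abs(z[:, 0] - z[:, 1]) / np.sqrt(2.0)


rng = np.random.default_rng(0)

# Strongly dependent 2-D latents: the second coordinate is a noisy copy
# of the first, so the joint differs from the product of marginals.
z0 = rng.standard_normal(2000)
z = np.stack([z0, z0 + 0.1 * rng.standard_normal(2000)], axis=1)

est = wasserstein_tc_estimate(toy_critic, z, rng)
print(f"critic-based estimate on dependent latents: {est:.3f}")
```

Because the critic is fixed instead of maximized over all 1-Lipschitz functions, the printed value is only a lower bound on the distance under these assumptions. In training, the critic would be updated to increase this quantity while the encoder is updated to decrease it.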