Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders
This paper addresses the challenge of leveraging deep-latent hierarchies in unsupervised density-based models. It is aimed at researchers in generative modeling and offers an incremental improvement over prior Wasserstein Autoencoder methods.
The paper tackles the problem of training deep-latent hierarchical models by proposing a novel approach based on Optimal Transport, which avoids latent variable collapse and yields qualitatively better sample generations and more interpretable latent representations compared to existing methods.
Probabilistic models with hierarchical-latent-variable structures provide state-of-the-art results amongst non-autoregressive, unsupervised density-based models. However, the most common approach to training such models, based on Variational Autoencoders (VAEs), often fails to leverage deep-latent hierarchies; successful approaches require complex inference and optimisation schemes. Optimal Transport is an alternative, non-likelihood-based framework for training generative models with appealing theoretical properties, in principle allowing easier training convergence between distributions. In this work we propose a novel approach to training models with deep-latent hierarchies based on Optimal Transport, without the need for highly bespoke models and inference networks. We show that our method enables the generative model to fully leverage its deep-latent hierarchy, avoiding the well-known "latent variable collapse" issue of VAEs, and thereby providing qualitatively better sample generations as well as more interpretable latent representations than the original Wasserstein Autoencoder with Maximum Mean Discrepancy divergence.
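
For readers unfamiliar with the Maximum Mean Discrepancy (MMD) divergence mentioned above, the following is a minimal sketch of an unbiased sample-based estimate of squared MMD between two sets of latent codes, for example codes produced by an encoder and samples drawn from a prior. The RBF kernel, the `bandwidth` parameter, and the function names `rbf_kernel` and `mmd2_unbiased` are illustrative assumptions and are not taken from the paper, which may use a different kernel and estimator.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b.

    The kernel choice and bandwidth are assumptions for illustration.
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples x (n, d) and y (m, d)."""
    n, m = x.shape[0], y.shape[0]
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    # Exclude the diagonal (i == j) terms from the within-sample averages.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prior_samples = rng.normal(size=(400, 2))            # stand-in for prior draws
    matched_codes = rng.normal(size=(400, 2))            # same distribution
    shifted_codes = rng.normal(loc=3.0, size=(400, 2))   # mismatched distribution
    print("matched:", mmd2_unbiased(matched_codes, prior_samples))
    print("shifted:", mmd2_unbiased(shifted_codes, prior_samples))
```

The estimate should be close to zero when both samples come from the same distribution and clearly positive when they differ, which is what makes it usable as a latent-space regulariser in a Wasserstein Autoencoder.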