Gromov-Wasserstein Autoencoders
This work addresses a specific issue in representation learning: incorporating meta-priors into VAE-style models without destabilizing training. It offers a more stable way to do so, though the contribution appears incremental because it builds on existing VAE frameworks.
The paper tackles the undesirable changes in training that arise when VAE-based models incorporate meta-priors. It proposes Gromov-Wasserstein Autoencoders (GWAE), which directly match the latent and data distributions using the Gromov-Wasserstein metric. Empirical results show that the approach works for disentanglement and clustering without altering the objective.
Variational Autoencoder (VAE)-based generative models offer flexible representation learning by incorporating meta-priors, general premises considered beneficial for downstream tasks. However, the incorporated meta-priors often involve ad-hoc model deviations from the original likelihood architecture, causing undesirable changes in their training. In this paper, we propose a novel representation learning method, Gromov-Wasserstein Autoencoders (GWAE), which directly matches the latent and data distributions using the variational autoencoding scheme. Instead of likelihood-based objectives, GWAE models minimize the Gromov-Wasserstein (GW) metric between the trainable prior and the given data distribution. The GW metric measures the discrepancy between the distance structures of two distributions, even when they have different dimensionalities, and thus provides a direct measure between the latent and data spaces. By restricting the prior family, we can introduce meta-priors into the latent space without changing the objective. The empirical comparisons with VAE-based models show that GWAE models work with two prominent meta-priors, disentanglement and clustering, with their GW objective unchanged.
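
To make the idea of comparing distance structures across spaces of different dimensionality concrete, the following is a minimal sketch of the standard discrete Gromov-Wasserstein cost with a squared loss, evaluated for a fixed coupling between two point sets. It is not the estimator used in the paper, which the abstract does not describe. The function names `pairwise_dist` and `gw_cost`, the toy data, and the choice of Euclidean distances are assumptions made for illustration, and the coupling is supplied by hand rather than optimized.

```python
import numpy as np


def pairwise_dist(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def gw_cost(dist_x, dist_z, coupling):
    """Discrete GW cost for a FIXED coupling (no optimization over couplings).

    Computes sum_{i,j,k,l} (dist_x[i,k] - dist_z[j,l])**2 * T[i,j] * T[k,l].
    The full GW metric would additionally minimize this over all couplings T
    whose marginals match the two distributions.
    """
    # loss[i, j, k, l] = (dist_x[i, k] - dist_z[j, l]) ** 2
    loss = (dist_x[:, None, :, None] - dist_z[None, :, None, :]) ** 2
    return float(np.einsum("ijkl,ij,kl->", loss, coupling, coupling))


# Toy setup (illustrative only): 6 latent points in 2-D and their
# isometric embedding into a 5-D "data" space via zero padding.
rng = np.random.default_rng(0)
n = 6
z = rng.normal(size=(n, 2))
x = np.concatenate([z, np.zeros((n, 3))], axis=1)

# Coupling that pairs point i in data space with point i in latent space.
identity_coupling = np.eye(n) / n

# Same distance structure in both spaces: the cost vanishes.
cost_isometric = gw_cost(pairwise_dist(x), pairwise_dist(z), identity_coupling)

# Stretching the latent points changes their distance structure,
# so the same coupling now incurs a positive cost.
cost_stretched = gw_cost(pairwise_dist(x), pairwise_dist(2.0 * z), identity_coupling)

print(cost_isometric, cost_stretched)
```

The cost depends only on the two within-space distance matrices, never on distances between a data point and a latent point, which is why the two spaces may have different dimensionalities. The four-index tensor makes this sketch quadratic in each sample count, so it is suitable only for small examples.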