LG · ML · Feb 25, 2019

Wasserstein-Wasserstein Auto-Encoders

arXiv:1902.09323v1 · 13 citations
AI Analysis

This paper aims to make generative model training more stable and efficient for researchers and practitioners. The contribution appears incremental, since it builds on existing auto-encoder and optimal transport frameworks.

The paper tackles the blurriness of variational auto-encoders (VAEs) and the training instability of generative adversarial networks (GANs) by proposing Wasserstein-Wasserstein auto-encoders (WWAE). WWAE minimizes a penalized optimal transport objective and relies on Gaussian assumptions for the latent code to obtain a closed-form penalty, which keeps training computationally efficient. The reported results are better latent structures than VAEs and better FID scores (lower is better) on datasets such as MNIST and CelebA.

To address the challenges in learning deep generative models (e.g., the blurriness of variational auto-encoders and the instability of training generative adversarial networks), we propose a novel deep generative model, named Wasserstein-Wasserstein auto-encoders (WWAE). We formulate WWAE as minimization of the penalized optimal transport between the target distribution and the generated distribution. By noticing that both the prior $P_Z$ and the aggregated posterior $Q_Z$ of the latent code $Z$ can be well captured by Gaussians, the proposed WWAE utilizes the closed form of the squared Wasserstein-2 distance for two Gaussians in the optimization process. As a result, WWAE does not suffer from the sampling burden and it is computationally efficient by leveraging the reparameterization trick. Numerical results evaluated on multiple benchmark datasets including MNIST, Fashion-MNIST and CelebA show that WWAE learns better latent structures than VAEs and generates samples of better visual quality and better FID scores than VAEs and GANs.
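The closed form mentioned in the abstract is the standard result for two Gaussians: $W_2^2(\mathcal{N}(m_1,\Sigma_1),\mathcal{N}(m_2,\Sigma_2)) = \|m_1-m_2\|^2 + \mathrm{Tr}\big(\Sigma_1+\Sigma_2-2(\Sigma_2^{1/2}\Sigma_1\Sigma_2^{1/2})^{1/2}\big)$. The sketch below evaluates it with NumPy. The function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical, chosen for this illustration. The usage example fits a Gaussian to a batch of latent codes by its empirical mean and covariance, which is an assumption here and may not match the estimator the paper uses.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    mat = 0.5 * (mat + mat.T)  # remove tiny asymmetries from round-off
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # clamp small negative eigenvalues
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gaussian_w2_squared(mean_p, cov_p, mean_q, cov_q):
    """Squared Wasserstein-2 distance between N(mean_p, cov_p) and N(mean_q, cov_q)."""
    mean_p = np.asarray(mean_p, dtype=float)
    mean_q = np.asarray(mean_q, dtype=float)
    cov_p = np.asarray(cov_p, dtype=float)
    cov_q = np.asarray(cov_q, dtype=float)

    sqrt_q = psd_sqrt(cov_q)
    cross = psd_sqrt(sqrt_q @ cov_p @ sqrt_q)  # (cov_q^{1/2} cov_p cov_q^{1/2})^{1/2}

    mean_term = np.sum((mean_p - mean_q) ** 2)
    cov_term = np.trace(cov_p + cov_q - 2.0 * cross)
    return float(mean_term + cov_term)


# Usage (illustrative assumption): compare a batch of latent codes,
# summarized by its empirical moments, against a standard normal prior.
rng = np.random.default_rng(0)
codes = rng.normal(size=(256, 8))        # stand-in for encoder outputs
batch_mean = codes.mean(axis=0)
batch_cov = np.cov(codes, rowvar=False)
penalty = gaussian_w2_squared(batch_mean, batch_cov, np.zeros(8), np.eye(8))
```

Because the penalty depends only on means and covariances, it needs no sampling from the prior. In a training loop the same expression would be written in an automatic-differentiation framework so that gradients flow back to the encoder.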
