ML · May 22, 2017

From optimal transport to generative modeling: the VEGAN cookbook

Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, Bernhard Schoelkopf

arXiv:1705.07642v1 25.5154 citations

Originality: Incremental advance

AI Analysis

This work provides theoretical justification for existing generative modeling methods. The advance is incremental, but it clarifies foundational concepts for machine learning researchers.

The paper frames unsupervised generative modeling as an optimal transport problem between the data and model distributions. It shows that a penalized relaxation of this problem coincides, for the 2-Wasserstein distance, with the adversarial auto-encoder objective, and it offers theoretical insight into variational auto-encoders and Wasserstein GANs.

We study unsupervised generative modeling in terms of the optimal transport (OT) problem between true (but unknown) data distribution $P_X$ and the latent variable model distribution $P_G$. We show that the OT problem can be equivalently written in terms of probabilistic encoders, which are constrained to match the posterior and prior distributions over the latent space. When relaxed, this constrained optimization problem leads to a penalized optimal transport (POT) objective, which can be efficiently minimized using stochastic gradient descent by sampling from $P_X$ and $P_G$. We show that POT for the 2-Wasserstein distance coincides with the objective heuristically employed in adversarial auto-encoders (AAE) (Makhzani et al., 2016), which provides the first theoretical justification for AAEs known to the authors. We also compare POT to other popular techniques like variational auto-encoders (VAE) (Kingma and Welling, 2014). Our theoretical results include (a) a better understanding of the commonly observed blurriness of images generated by VAEs, and (b) establishing duality between Wasserstein GAN (Arjovsky and Bottou, 2017) and POT for the 1-Wasserstein distance.
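
As a rough sketch of the objects the abstract describes, the OT cost, its encoder form, and the relaxed POT objective can be written as follows. The symbols $c$ (transport cost), $G$ (decoder, assumed deterministic here), $Q(Z|X)$ (probabilistic encoder), $P_Z$ (latent prior), $Q_Z$ (aggregate posterior), $\lambda$ (penalty weight), and $D$ (a divergence between distributions on the latent space) are notation assumed for illustration and do not appear in the abstract itself.

$$W_c(P_X, P_G) = \inf_{\Gamma \in \mathcal{P}(X \sim P_X,\, Y \sim P_G)} \mathbb{E}_{(X,Y) \sim \Gamma}\left[c(X, Y)\right]$$

$$W_c(P_X, P_G) = \inf_{Q(Z|X):\; Q_Z = P_Z} \mathbb{E}_{P_X}\, \mathbb{E}_{Q(Z|X)}\left[c\big(X, G(Z)\big)\right], \qquad Q_Z := \mathbb{E}_{P_X}\left[Q(Z|X)\right]$$

$$\mathrm{POT}: \quad \inf_{Q(Z|X)} \; \mathbb{E}_{P_X}\, \mathbb{E}_{Q(Z|X)}\left[c\big(X, G(Z)\big)\right] + \lambda\, D(Q_Z, P_Z)$$

Under these assumptions, choosing $c(x, y) = \|x - y\|^2$ gives a reconstruction term plus a latent-matching penalty, which is the general shape of the AAE objective that the abstract relates to the 2-Wasserstein case.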
