LG · OC · ML · Feb 22, 2018

On the Convergence and Robustness of Training GANs with Regularized Optimal Transport

arXiv:1802.08249v2 · 150 citations
AI Analysis

This work addresses the problem of unstable and computationally intensive GAN training for machine learning practitioners, offering an incremental improvement with a more efficient and theoretically grounded method.

The authors address the computational difficulty of training GANs with the Wasserstein distance by proposing a smoothed formulation based on regularized optimal transport. Gradients of the smoothed objective are cheap to compute, so first-order optimization methods apply. This yields a theoretical guarantee of convergence to stationarity and, on MNIST and CIFAR-10, generated images comparable to the state of the art given the same architecture and computational power.

Generative Adversarial Networks (GANs) are one of the most practical methods for learning data distributions. A popular GAN formulation is based on the use of the Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is a computationally challenging problem, as its objective is non-convex, non-smooth, and even hard to compute. In this work, we show that obtaining gradient information of the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless, and hence one can apply first-order optimization methods to minimize this objective. Consequently, we establish a theoretical convergence guarantee to stationarity for a proposed class of GAN optimization algorithms. Unlike the original non-smooth formulation, our algorithm only requires solving the discriminator to approximate optimality. We apply our method to learning MNIST digits as well as CIFAR-10 images. Our experiments show that our method is computationally efficient and generates images comparable to the state-of-the-art algorithms given the same architecture and computational power.
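
To make the claim about cheap gradients concrete, here is a minimal sketch of one common form of regularized OT: entropic regularization solved with Sinkhorn iterations. The abstract does not say which regularizer or solver the authors use, so the entropic choice, the squared Euclidean cost, the uniform sample weights, and the function names `sinkhorn_plan` and `regularized_ot_grad` are all assumptions made for illustration, not the paper's algorithm. The sketch relies on the envelope (Danskin) argument: once the regularized transport plan P is known, the gradient of the regularized OT value with respect to a generated sample x_i is just the P-weighted sum of the cost gradients, with no differentiation through the solver.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, eps=0.5, n_iters=500):
    """Approximate the entropy-regularized OT plan between weights a and b.

    Assumption: entropic regularization with strength eps (illustrative choice).
    """
    K = np.exp(-cost / eps)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                  # rescale to match row marginals
        v = b / (K.T @ u)                # rescale to match column marginals
    return u[:, None] * K * v[None, :]   # plan = diag(u) K diag(v)


def regularized_ot_grad(x, y, eps=0.5, n_iters=500):
    """Return the plan and the gradient of regularized OT w.r.t. the samples x.

    x: (n, d) generated samples, y: (m, d) data samples, both uniformly weighted.
    Cost is squared Euclidean distance (an assumption for this sketch).
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    diff = x[:, None, :] - y[None, :, :]          # pairwise differences, (n, m, d)
    cost = np.sum(diff ** 2, axis=-1)             # squared distances, (n, m)
    plan = sinkhorn_plan(cost, a, b, eps, n_iters)
    # Envelope argument: d/dx_i = sum_j plan[i, j] * 2 * (x_i - y_j)
    grad = 2.0 * np.einsum("ij,ijd->id", plan, diff)
    return plan, grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((5, 2))   # stand-in for generator outputs
    y = rng.random((6, 2))   # stand-in for data samples
    plan, grad = regularized_ot_grad(x, y)
    print(plan.sum(axis=1), grad.shape)
```

In a GAN setting, a gradient like `grad` would be backpropagated through the generator that produced `x`. The kernel `np.exp(-cost / eps)` underflows when `eps` is small relative to the cost scale, so a log-domain variant would be needed there; this sketch omits it for brevity.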

