Image Generation and Editing with Variational Info Generative Adversarial Networks
This work addresses image generation and editing for computer vision applications, but it appears incremental as it builds on existing GAN and VAE models.
The paper tackles image generation and editing by proposing Variational InfoGAN (ViGAN). The model aims to generate new images conditioned on visual descriptions and to modify an image by varying the description while keeping its latent representation fixed. Its capabilities are demonstrated on the LFW, celebA, and modified MNIST datasets.
Recently there has been enormous interest in generative models for images in deep learning. In pursuit of this, Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) have surfaced as the two most prominent and popular models. While VAEs tend to produce excellent reconstructions but blurry samples, GANs generate sharp but slightly distorted images. In this paper we propose a new model called Variational InfoGAN (ViGAN). Our aim is twofold: (i) to generate new images conditioned on visual descriptions, and (ii) to modify a given image by fixing its latent representation and varying the visual description. We evaluate our model on the Labeled Faces in the Wild (LFW), celebA, and a modified version of MNIST datasets, and demonstrate its ability to generate new images as well as to modify a given image by changing attributes.
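The second aim, editing an image by holding its latent representation fixed and changing only the visual description, can be illustrated with a toy sketch. This is not the ViGAN architecture: the generator below is a fixed random linear map standing in for a trained network, and the names `generate`, `edit`, `LATENT_DIM`, `ATTR_DIM`, and `IMAGE_SHAPE` are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8         # size of the latent code z (assumed for this sketch)
ATTR_DIM = 3           # number of visual attributes (assumed for this sketch)
IMAGE_SHAPE = (4, 4)   # tiny "image" so the example stays readable
N_PIXELS = IMAGE_SHAPE[0] * IMAGE_SHAPE[1]

# Stand-in for a trained generator: random weights, NOT learned parameters.
W_z = rng.normal(size=(LATENT_DIM, N_PIXELS))
W_a = rng.normal(size=(ATTR_DIM, N_PIXELS))


def generate(z, attrs):
    """Map a latent code and an attribute vector to a toy image."""
    flat = np.tanh(z @ W_z + attrs @ W_a)
    return flat.reshape(IMAGE_SHAPE)


def edit(z, attrs, index, value):
    """Regenerate with one attribute changed while z is held fixed."""
    new_attrs = attrs.copy()
    new_attrs[index] = value
    return generate(z, new_attrs)


# In a real model z would come from an encoder applied to the input image;
# here it is simply sampled.
z = rng.normal(size=LATENT_DIM)
attrs = np.array([1.0, 0.0, 0.0])

original = generate(z, attrs)
edited = edit(z, attrs, index=1, value=1.0)  # switch on the second attribute
```

Because `z` is shared between the two calls, any difference between `original` and `edited` comes from the attribute vector alone, which is the separation of content and description that the editing task relies on.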