CV · LG · ML · Aug 26, 2019

PixelVAE++: Improved PixelVAE with Discrete Prior

arXiv:1908.09948v1 · 33 citations
AI Analysis

Aimed at machine learning researchers, this work addresses the problem of generating high-quality images while keeping the latent variables informative. It is an incremental improvement over existing hybrid models.

The paper tackles the challenge of constructing powerful generative models for natural images. It introduces PixelVAE++, which combines a variational autoencoder with a PixelCNN++ decoder to capture both global and local structure. The model achieves state-of-the-art performance on the binary datasets MNIST and Omniglot, and state-of-the-art performance on CIFAR-10 among latent variable models.

Constructing powerful generative models for natural images is a challenging task. PixelCNN models capture details and local information in images very well but have a limited receptive field. Variational autoencoders with a factorial decoder can capture global information easily, but they often fail to reconstruct details faithfully. PixelVAE combines the best features of the two models and constructs a generative model that is able to learn both local and global structure. Here we introduce PixelVAE++, a VAE with three types of latent variables and a PixelCNN++ decoder. We introduce a novel architecture that reuses part of the decoder as an encoder. We achieve state-of-the-art performance on binary datasets such as MNIST and Omniglot, and state-of-the-art performance on CIFAR-10 among latent variable models, while keeping the latent variables informative.
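
As a rough illustration of how a VAE with an autoregressive decoder is typically trained, the standard evidence lower bound can be written as below. This is a generic sketch, not the paper's exact objective. The symbols are assumptions introduced here: $x$ is an image with $D$ pixels, $x_{<i}$ denotes the pixels preceding pixel $i$ in the autoregressive ordering, $q_\phi$ is the encoder, $p_\theta$ is the decoder, and a single latent $z$ stands in for the three types of latent variables, which the abstract does not specify.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\, \sum_{i=1}^{D} \log p_\theta\!\left(x_i \mid x_{<i},\, z\right) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\middle\|\, p(z) \right)
```

Under this reading, the autoregressive term lets the decoder model local detail pixel by pixel, while conditioning on $z$ supplies global information. With a factorial decoder, the conditioning on $x_{<i}$ is dropped and each pixel depends on $z$ alone.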

