LG, ML · Nov 8, 2016

Variational Lossy Autoencoder

arXiv:1611.02731v2 · 706 citations
Originality: Incremental advance
AI Analysis

This work addresses representation learning for downstream tasks like classification by enabling control over what information is retained in latent codes, though it is incremental as it builds on existing VAE and autoregressive methods.

The paper tackles the problem of learning global representations that discard irrelevant details like texture in images by combining Variational Autoencoders with neural autoregressive models, achieving new state-of-the-art results on density estimation tasks for MNIST, OMNIGLOT, and Caltech-101 Silhouettes.

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to have control over what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, and hence the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both prior distribution $p(z)$ and decoding distribution $p(x|z)$, we can greatly improve generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.
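One way to picture "designing the architecture accordingly" is an autoregressive decoder that, when predicting a pixel, may only look at a small window of already-generated neighbours plus the latent code. Anything outside that window (global shape, for example) can then reach the decoder only through the latent code, while local texture can be modelled from the neighbours. The sketch below is an illustration under that assumption, not the authors' implementation: the function name `local_causal_mask`, the `window` parameter, and the raster-scan ordering are all hypothetical choices made for this example.

```python
import numpy as np

def local_causal_mask(height, width, row, col, window=3):
    """Return a boolean (height, width) mask of the pixels a windowed
    autoregressive decoder could condition on when predicting (row, col).

    Assumption for this sketch: pixels are generated in raster-scan order,
    and the decoder sees only earlier pixels that fall inside a
    `window` x `window` neighbourhood centred on the target pixel.
    """
    half = window // 2
    mask = np.zeros((height, width), dtype=bool)
    for r in range(max(0, row - half), row + 1):
        for c in range(max(0, col - half), min(width, col + half + 1)):
            # Earlier in raster order: any previous row, or the same row
            # and a smaller column index. The target pixel itself is excluded.
            if r < row or c < col:
                mask[r, c] = True
    return mask

# Context available for the pixel at (4, 4) in an 8x8 image.
mask = local_causal_mask(height=8, width=8, row=4, col=4, window=3)
visible = int(mask.sum())          # pixels inside the local window
preceding = 4 * 8 + 4              # all pixels before (4, 4) in raster order
print(f"decoder sees {visible} of {preceding} preceding pixels")
```

With a 3x3 window the decoder conditions on only a handful of the preceding pixels, so long-range structure has to be carried by the latent code. Widening `window` hands more of that information back to the autoregressive part.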

