LG · Nov 2, 2017

Neural Discrete Representation Learning

arXiv:1711.00937v2 · 7329 citations
Originality: Highly original
AI Analysis

The paper addresses unsupervised learning of useful discrete representations for generative modeling, offering a novel approach that overcomes limitations of existing VAEs.

The paper tackles the challenge of unsupervised representation learning by proposing the VQ-VAE model, which learns discrete latent codes to avoid posterior collapse and, when paired with an autoregressive prior, generates high-quality images, videos, and speech.

Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways: the encoder network outputs discrete, rather than continuous, codes; and the prior is learnt rather than static. In order to learn a discrete latent representation, we incorporate ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent issues of "posterior collapse" -- where the latents are ignored when they are paired with a powerful autoregressive decoder -- typically observed in the VAE framework. Pairing these representations with an autoregressive prior, the model can generate high quality images, videos, and speech as well as doing high quality speaker conversion and unsupervised learning of phonemes, providing further evidence of the utility of the learnt representations.
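To make the "discrete, rather than continuous, codes" idea concrete, here is a minimal sketch of a vector quantisation lookup: each continuous encoder output is replaced by the index of its nearest codebook entry. The abstract does not specify the distance measure, codebook size, or dimensionality, so the squared Euclidean distance, the toy codebook, the sample encoder outputs, and the function name `quantise` are all assumptions made for illustration. Training details (how the codebook and encoder are updated) are not covered.

```python
import math
from typing import List, Sequence, Tuple


def quantise(
    z_e: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return (index, vector) of the codebook entry nearest to z_e.

    Nearness is measured by squared Euclidean distance (an assumption;
    the abstract does not name the metric).
    """
    best_index = -1
    best_dist = math.inf
    for k, e_k in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(z_e, e_k))
        if dist < best_dist:
            best_index, best_dist = k, dist
    return best_index, list(codebook[best_index])


# Hypothetical toy codebook: 3 entries, each of dimension 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Hypothetical continuous encoder outputs, one vector per latent position.
encoder_outputs = [[0.9, 1.2], [-0.2, 0.1], [-0.8, 1.7]]

# The discrete codes are the indices; the quantised vectors feed the decoder.
codes = [quantise(z, codebook)[0] for z in encoder_outputs]
quantised = [quantise(z, codebook)[1] for z in encoder_outputs]

print(codes)      # [1, 0, 2]
print(quantised)  # [[1.0, 1.0], [0.0, 0.0], [-1.0, 2.0]]
```

The integer indices in `codes` are the discrete latent representation; a learnt prior, such as the autoregressive one mentioned in the abstract, would model the distribution over sequences of these indices.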

Code Implementations: 50 repos

Data from Papers with Code (CC-BY-SA-4.0)

