Noise Contrastive Variational Autoencoders
The work addresses a fundamental issue in VAEs that matters to both researchers and practitioners, and it offers a simple fix to improve model performance, though the contribution is incremental because it builds on existing VAE frameworks.
The paper tackles the posterior collapse problem in variational autoencoders (VAEs), where latent codes become independent of inputs, by proposing NC-VAE, which uses noise contrastive estimation to prevent this collapse and shows empirical benefits on image and text datasets.
We take steps towards understanding the "posterior collapse (PC)" difficulty in variational autoencoders (VAEs), i.e., a degenerate optimum in which the latent codes become independent of their corresponding inputs. We rely on calculus of variations and theoretically explore a few popular VAE models, showing that PC always occurs for non-parametric encoders and decoders. Inspired by the popular noise contrastive estimation algorithm, we propose NC-VAE where the encoder discriminates between the latent codes of real data and of some artificially generated noise, in addition to encouraging good data reconstruction abilities. Theoretically, we prove that our model cannot reach PC and provide novel lower bounds. Our method is straightforward to implement and has the same run-time as vanilla VAE. Empirically, we showcase its benefits on popular image and text datasets.
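The intuition behind the abstract can be illustrated with a small NumPy sketch. It is not the paper's objective, which the abstract does not spell out. It assumes a diagonal Gaussian encoder with a standard normal prior, and it uses a hypothetical linear critic `w` on the latent means together with a plain binary logistic loss as a stand-in for the noise-contrastive term. Under those assumptions, a collapsed encoder has zero KL to the prior, and its codes for real data and noise are indistinguishable, so the contrastive loss cannot drop below 2 log 2.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per row."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)

def log_sigmoid(t):
    # Numerically stable log(sigmoid(t)).
    return -np.logaddexp(0.0, -t)

def contrastive_loss(logits_real, logits_noise):
    """Binary logistic loss: real codes are labeled 1, noise codes 0."""
    return -np.mean(log_sigmoid(logits_real)) - np.mean(log_sigmoid(-logits_noise))

# Hypothetical linear critic acting on latent means (an assumption for
# illustration, not something taken from the paper).
w = np.array([1.0, -1.0])

# Collapsed encoder: every input, real or noise, is mapped to the prior.
mu_collapsed = np.zeros((4, 2))
logvar_collapsed = np.zeros((4, 2))
kl_collapsed = gaussian_kl(mu_collapsed, logvar_collapsed)
loss_collapsed = contrastive_loss(mu_collapsed @ w, mu_collapsed @ w)

# Informative encoder: real and noise inputs land in different regions.
mu_real = np.tile(np.array([2.0, -2.0]), (4, 1))
mu_noise = -mu_real
loss_informative = contrastive_loss(mu_real @ w, mu_noise @ w)

print("KL under collapse:", kl_collapsed)
print("contrastive loss (collapsed):  ", loss_collapsed)
print("contrastive loss (informative):", loss_informative)
```

In this toy setting, adding a contrastive term to the usual reconstruction and KL terms penalizes the collapsed solution, which is the qualitative behavior the abstract describes. How NC-VAE actually weights and defines the term is left to the paper.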