DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
This work addresses a specific bottleneck in training discrete VAEs and is aimed at researchers in generative modeling. It is incremental, since it builds on existing Boltzmann prior methods.
The authors propose two relaxations of Boltzmann priors that allow discrete variational autoencoders to be trained with tighter importance-weighted bounds. On the MNIST and OMNIGLOT datasets, these relaxations outperform previous discrete VAEs with Boltzmann priors.
Boltzmann machines are powerful distributions that have been shown to be an effective prior over binary latent variables in variational autoencoders (VAEs). However, previous methods for training discrete VAEs have used the evidence lower bound and not the tighter importance-weighted bound. We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds. These relaxations are based on generalized overlapping transformations and the Gaussian integral trick. Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. An implementation which reproduces these results is available at https://github.com/QuadrantAI/dvae .
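To make the distinction between the evidence lower bound and the importance-weighted bound concrete, the following is a minimal sketch on a toy continuous model. It is not the authors' method and does not use a Boltzmann prior or either of the proposed relaxations. The model (z ~ N(0, 1), x | z ~ N(z, 1)), the deliberately mismatched proposal q(z | x) = N(x/2, 1), and the function names `iw_bound`, `log_normal`, and `toy_log_weights` are all assumptions chosen for illustration.

```python
import numpy as np

def iw_bound(log_w):
    """Importance-weighted bound estimate from K log-weights.

    log_w[k] = log p(x, z_k) - log q(z_k | x), with z_k drawn from q.
    Returns log((1/K) * sum_k exp(log_w[k])), computed stably.
    """
    log_w = np.asarray(log_w, dtype=float)
    m = log_w.max()  # subtract the max before exponentiating to avoid overflow
    return m + np.log(np.mean(np.exp(log_w - m)))

def log_normal(v, mean, var):
    """Log-density of a univariate Gaussian N(mean, var) evaluated at v."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (v - mean) ** 2 / var)

def toy_log_weights(x, k, rng):
    """Draw k samples from the assumed proposal and return their log-weights."""
    z = rng.normal(x / 2.0, 1.0, size=k)  # q(z | x) = N(x/2, 1), wider than the true posterior
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)
    log_q = log_normal(z, x / 2.0, 1.0)
    return log_joint - log_q

rng = np.random.default_rng(0)
x = 1.0
log_w = toy_log_weights(x, 50, rng)

elbo_estimate = log_w.mean()        # average of the K single-sample ELBO estimates
iw_estimate = iw_bound(log_w)       # K-sample importance-weighted estimate
exact = log_normal(x, 0.0, 2.0)     # marginal of this toy model: x ~ N(0, 2)

print(elbo_estimate, iw_estimate, exact)
```

By Jensen's inequality, the importance-weighted estimate computed from a set of log-weights is never smaller than the mean of those log-weights, which is why the bound is described as tighter. In this toy setting the samples are continuous, so gradients could flow through them by reparameterization; the relaxations in the abstract are what make a comparable continuous treatment available for Boltzmann priors.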