Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains
This work addresses a key bottleneck in VAE training for machine learning practitioners, the intractable marginal likelihood, by offering unbiased gradient estimates in place of a lower-bound objective, though it is incremental relative to existing importance sampling approaches.
The paper tackles the challenge of maximum likelihood training for variational auto-encoders (VAEs) by developing unbiased gradient estimators using coupled Markov chains, resulting in improved predictive performance as shown experimentally.
The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture; one of them parameterizes the model's likelihood. Fitting its parameters via maximum likelihood (ML) is challenging since the computation of the marginal likelihood involves an intractable integral over the latent space; thus the VAE is trained instead by maximizing a variational lower bound. Here, we develop an ML training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient. We obtain the estimators by augmenting the latent space with a set of importance samples, similarly to the importance weighted auto-encoder (IWAE), and then constructing a Markov chain Monte Carlo coupling procedure on this augmented space. We provide the conditions under which the estimators can be computed in finite time and with finite variance. We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
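To make the role of the importance samples concrete, the following is a minimal sketch of an IWAE-style importance-weighted estimate of log p(x). It is not the paper's coupled-chain estimator. Everything in it is an illustrative assumption: the toy model (z ~ N(0, 1), x | z ~ N(z, 1), chosen because its marginal N(0, 2) is known in closed form), the Gaussian proposal, and the helper names log_normal and iw_log_likelihood. In a VAE, the decoder and encoder networks would take the place of the fixed densities used here.

```python
import numpy as np

def log_normal(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def iw_log_likelihood(x, q_mean, q_var, num_samples, rng):
    """Importance-weighted estimate of log p(x) for the toy model
    z ~ N(0, 1), x | z ~ N(z, 1), using the proposal q(z | x) = N(q_mean, q_var).
    (Hypothetical helper for illustration only.)"""
    # Draw K latent samples from the proposal.
    z = q_mean + np.sqrt(q_var) * rng.standard_normal(num_samples)
    # Log importance weights: log p(z) + log p(x | z) - log q(z | x).
    log_w = (log_normal(z, 0.0, 1.0)
             + log_normal(x, z, 1.0)
             - log_normal(z, q_mean, q_var))
    # Log of the average weight, computed stably via log-sum-exp.
    m = np.max(log_w)
    return m + np.log(np.mean(np.exp(log_w - m)))

rng = np.random.default_rng(0)
x = 1.3
exact = log_normal(x, 0.0, 2.0)  # the marginal of this toy model is N(0, 2)

# With a mismatched proposal (here the prior), the estimate is in expectation
# a lower bound on log p(x) that tightens as the number of samples K grows.
for k in (1, 10, 100):
    avg = np.mean([iw_log_likelihood(x, 0.0, 1.0, k, rng) for _ in range(2000)])
    print(f"K={k:4d}  average estimate={avg:.4f}  exact log p(x)={exact:.4f}")
```

For any finite number of samples the logarithm of the averaged weights is a biased estimate of log p(x), and so is its gradient. That bias is the gap the unbiased coupled-chain estimators described in the abstract are meant to close.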