ML LG ST · Oct 20, 2020

VarGrad: A Low-Variance Gradient Estimator for Variational Inference

arXiv:2010.10436v2 · 83 citations
Originality: Incremental advance
AI Analysis

The paper addresses a practical bottleneck in variational inference, the high variance of gradient estimators, though the advance is incremental because it builds on existing score function methods.

The paper tackles high variance in gradient estimators for variational inference by proposing VarGrad, a low-variance estimator that is derived from a log-variance loss and coincides with the score function method using leave-one-out control variates. It shows theoretically that VarGrad has lower variance than the score function method in certain settings, and reports a favourable variance versus computation trade-off on a discrete VAE.

We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
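To make the estimator in the abstract concrete, here is a minimal NumPy sketch under stated assumptions. It writes $f(z) = \log q_\phi(z) - \log p(x, z)$ and compares two forms that should agree: the gradient of half the sample variance of $f$ with the samples held fixed (the log-variance loss view), and the score function estimator with a leave-one-out baseline. The toy model (three independent Bernoulli latents with logits `phi`, and an unnormalised log joint `z @ a`) and all function names are hypothetical choices for illustration. They are not taken from the paper or its code, and this is not the paper's discrete VAE experiment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_q(z, phi):
    """Log-probability of each row of z under independent Bernoullis with logits phi."""
    return np.sum(z * phi - np.logaddexp(0.0, phi), axis=1)

def score(z, phi):
    """Gradient of log_q with respect to phi, one row per sample."""
    return z - sigmoid(phi)

def vargrad(z, phi, log_joint):
    """Gradient of half the sample variance of f = log q - log p, samples held fixed."""
    num_samples = z.shape[0]
    f = log_q(z, phi) - log_joint(z)
    return (f - f.mean()) @ score(z, phi) / (num_samples - 1)

def score_function_loo(z, phi, log_joint):
    """Score function estimator with a leave-one-out baseline for each sample."""
    num_samples = z.shape[0]
    f = log_q(z, phi) - log_joint(z)
    baseline = (f.sum() - f) / (num_samples - 1)  # mean of f over the other samples
    return (f - baseline) @ score(z, phi) / num_samples

def sample_q(rng, phi, num_samples):
    """Draw num_samples binary vectors from the factorised Bernoulli q."""
    return (rng.random((num_samples, phi.size)) < sigmoid(phi)).astype(float)

# Hypothetical toy problem: variational logits phi, target logits a.
phi = np.array([0.5, -1.0, 2.0])
a = np.array([1.0, 0.0, -0.5])

def log_joint(z):
    """Unnormalised log p(x, z) for the toy target (an assumption for this sketch)."""
    return z @ a

# Closed-form gradient of the negative ELBO for this toy model, used as a reference.
exact_grad = sigmoid(phi) * (1.0 - sigmoid(phi)) * (phi - a)

rng = np.random.default_rng(0)
z = sample_q(rng, phi, 8)
g_vargrad = vargrad(z, phi, log_joint)
g_loo = score_function_loo(z, phi, log_joint)
print("VarGrad form:      ", g_vargrad)
print("Leave-one-out form:", g_loo)
print("Exact gradient:    ", exact_grad)
```

The two estimates should match up to floating-point error, because subtracting the leave-one-out mean from $f(z_s)$ is the same as scaling the centred value $f(z_s) - \bar f$ by $S/(S-1)$. Averaging many independent estimates should approach `exact_grad`, which reflects the unbiasedness claimed in the abstract for this toy case.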

Code Implementations: 1 repo
