ML LG ST · Oct 20, 2020

VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Lorenz Richter, Ayman Boustati, Nikolas Nüsken, Francisco J. R. Ruiz, Ömer Deniz Akyildiz

arXiv:2010.10436v2 · 23.083 citations · Has Code

Originality: Incremental advance

AI Analysis

VarGrad addresses a practical bottleneck in variational inference, the high variance of gradient estimators, though the advance is incremental because it builds on existing score function methods.

The paper tackles the problem of high variance in gradient estimators for variational inference by proposing VarGrad, a low-variance estimator based on a log-variance loss and leave-one-out control variates, showing theoretical lower variance in certain settings and empirical improvements in a discrete VAE.

We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
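As a rough illustration of the estimator described in the abstract, the sketch below uses a single binary latent with $q_\theta(z) = \mathrm{Bernoulli}(\sigma(\theta))$ and writes $f_s = \log q_\theta(z_s) - \log p(x, z_s)$ for samples $z_s \sim q_\theta$. Under these assumptions the score function estimator with leave-one-out control variates reduces to $\frac{1}{S-1}\sum_s (f_s - \bar f)\,\nabla_\theta \log q_\theta(z_s)$, which is also the gradient of half the sample variance of $f$ with the samples held fixed. The toy model, the function names and the numbers are hypothetical and are not taken from the paper or its code.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def log_q(z, theta):
    """Log-probability of z in {0, 1} under Bernoulli(sigmoid(theta))."""
    p = sigmoid(theta)
    return math.log(p) if z == 1 else math.log(1.0 - p)


def score(z, theta):
    """d/dtheta log q(z); for a Bernoulli with logit theta this is z - sigmoid(theta)."""
    return z - sigmoid(theta)


def toy_log_joint(z):
    """Hypothetical unnormalised log p(x, z) for one binary latent (assumption)."""
    return math.log(0.3) if z == 1 else math.log(0.1)


def vargrad(samples, theta, log_joint):
    """Leave-one-out score function estimate of the negative-ELBO gradient."""
    S = len(samples)
    f = [log_q(z, theta) - log_joint(z) for z in samples]
    f_bar = sum(f) / S
    weighted = sum((fs - f_bar) * score(z, theta) for fs, z in zip(f, samples))
    return weighted / (S - 1)


def half_log_variance(samples, theta, log_joint):
    """Half the unbiased sample variance of log q - log p, samples held fixed."""
    S = len(samples)
    f = [log_q(z, theta) - log_joint(z) for z in samples]
    f_bar = sum(f) / S
    return 0.5 * sum((fs - f_bar) ** 2 for fs in f) / (S - 1)


def exact_neg_elbo_grad(theta, log_joint):
    """Closed-form gradient of E_q[log q - log p] for the two-state toy model."""
    p = sigmoid(theta)
    f1 = log_q(1, theta) - log_joint(1)
    f0 = log_q(0, theta) - log_joint(0)
    return p * (1.0 - p) * (f1 - f0)


def draw_samples(theta, S, rng):
    """Draw S independent samples from q_theta using the given random.Random."""
    p = sigmoid(theta)
    return [1 if rng.random() < p else 0 for _ in range(S)]


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.4
    zs = draw_samples(theta, 8, rng)
    print("VarGrad estimate:", vargrad(zs, theta, toy_log_joint))
    print("Exact gradient:  ", exact_neg_elbo_grad(theta, toy_log_joint))
```

Because only the centred values $f_s - \bar f$ enter the estimate, adding a constant to `toy_log_joint` (for example an unknown normalising constant) leaves the result unchanged, which is one reason a log-variance formulation is convenient in this setting.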
