ML · LG · Jun 13, 2014

Smoothed Gradients for Stochastic Variational Inference

arXiv:1406.3650v2 · 29 citations
Originality: Incremental advance
AI Analysis

This work addresses scalability and stability issues in Bayesian computation for large datasets, though it is incremental as it builds on existing SVI methods.

The paper tackles the high variance of the noisy gradients used in stochastic variational inference (SVI). It introduces a biased gradient estimate built from a fixed-window moving average, which reduces variance and mean-squared error against the full gradient while keeping the computational cost of SVI and increasing storage only by a constant factor. The method is demonstrated on latent Dirichlet allocation with three large corpora, showing significant improvements.

Stochastic variational inference (SVI) lets us scale up Bayesian computation to massive data. It uses stochastic optimization to fit a variational distribution, following easy-to-compute noisy natural gradients. As with most traditional stochastic optimization methods, SVI takes precautions to use unbiased stochastic gradients whose expectations are equal to the true gradients. In this paper, we explore the idea of following biased stochastic gradients in SVI. Our method replaces the natural gradient with a similarly constructed vector that uses a fixed-window moving average of some of its previous terms. We will demonstrate the many advantages of this technique. First, its computational cost is the same as for SVI and storage requirements only multiply by a constant factor. Second, it enjoys significant variance reduction over the unbiased estimates, smaller bias than averaged gradients, and leads to smaller mean-squared error against the full gradient. We test our method on latent Dirichlet allocation with three large corpora.
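
The fixed-window moving average described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the natural gradient has the usual conjugate form (prior plus rescaled minibatch statistics minus the current parameter) and that the terms being averaged are the minibatch statistics, neither of which the abstract spells out. The names `FixedWindowAverage`, `smoothed_step`, `prior`, and `noisy_stats` are hypothetical.

```python
from collections import deque


class FixedWindowAverage:
    """Running mean of the last `window` vectors (hypothetical helper)."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self.buffer = deque()  # stores at most `window` past vectors
        self.total = None      # elementwise sum of the buffered vectors

    def update(self, vector):
        vector = list(vector)
        if self.total is None:
            self.total = [0.0] * len(vector)
        # Add the newest term to the running sum.
        self.buffer.append(vector)
        self.total = [t + v for t, v in zip(self.total, vector)]
        # Drop the oldest term once the window is exceeded.
        if len(self.buffer) > self.window:
            oldest = self.buffer.popleft()
            self.total = [t - o for t, o in zip(self.total, oldest)]
        n = len(self.buffer)
        return [t / n for t in self.total]


def smoothed_step(lam, prior, noisy_stats, averager, step_size):
    """One update lam <- lam + step_size * (prior + averaged_stats - lam).

    Assumption: `noisy_stats` is the already rescaled minibatch statistic;
    with window == 1 this reduces to a plain, unsmoothed stochastic step.
    """
    avg = averager.update(noisy_stats)
    return [l + step_size * (p + a - l) for l, p, a in zip(lam, prior, avg)]
```

Keeping a running sum means each update costs one pass over the parameter vector regardless of the window length, while the buffer holds `window` past vectors, which is in line with the abstract's claim that cost matches SVI and storage grows by a constant factor.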

