LG OC May 8

SGD for Variational Inference: Tackling Unbounded Variance via Preconditioning and Dynamic Batching

arXiv:2605.07531
40.3
AI Analysis

For practitioners of variational inference, this work provides theoretical justification and practical guidance for using dynamic batching and preconditioning to ensure convergence, addressing a known bottleneck in BBVI.

The paper addresses unbounded variance in stochastic gradients for Black-Box Variational Inference, proving convergence guarantees for Minibatch Projected SGD with dynamic batching and preconditioning under the Blum-Gladyshev condition, and demonstrating efficacy on modern inference tasks.
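
For readers unfamiliar with the Blum-Gladyshev condition, one common way to write it is as a bound on the stochastic-gradient variance that is allowed to grow quadratically with the distance from an optimum. The notation below is an assumption made for illustration ($F$ for the negative ELBO, $\lambda$ for the variational parameters, $\lambda^\star$ for an optimum, $g$ for the stochastic gradient estimate, constants $A, B \ge 0$); the paper's exact statement may differ.

```latex
\mathbb{E}\,\bigl\| g(\lambda) - \nabla F(\lambda) \bigr\|^2
  \;\le\; A + B\,\bigl\| \lambda - \lambda^\star \bigr\|^2 ,
\qquad A, B \ge 0 .
```

Setting $B = 0$ recovers the usual bounded-variance assumption, which is the case the summary says BBVI gradients violate.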

Black-Box Variational Inference (BBVI) typically relies on Stochastic Gradient Descent (SGD) to optimize the Evidence Lower Bound (ELBO). However, the stochastic gradients in BBVI inherently exhibit unbounded variance, violating standard assumptions and instead satisfying the weaker Blum-Gladyshev (BG) condition, where variance grows quadratically with distance from the optimum. In this paper, we bridge the gap between stochastic optimization theory and the practical instances of BBVI. Focusing on the broad elliptic location-scale family of parameterized distributions, we offer two main contributions. First, we prove the existence of an ELBO solution, a foundational property usually assumed a priori in the literature. Second, we establish comprehensive convergence guarantees spanning finite-time and asymptotic regimes for Minibatch Projected SGD (PSGD) equipped with dynamic batching and preconditioning under the BG condition. Our theoretical framework demonstrates that dynamic batching combined with preconditioning systematically enables rigorous guarantees even in complex settings. We illustrate our theoretical findings with numerical results, highlighting the efficacy of our approach for modern inference tasks.
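
To make the ingredients named in the abstract concrete, here is a minimal sketch of minibatch projected SGD with a growing batch size and a diagonal preconditioner, applied to a mean-field Gaussian (a simple location-scale family). It is not the paper's algorithm: the function names `neg_elbo_grad` and `psgd_bbvi`, the linear batch-growth schedule, the 1/sqrt(t) step size, the box constraint set, the choice of preconditioner, and the diagonal Gaussian toy target are all assumptions made for illustration.

```python
import numpy as np

def neg_elbo_grad(m, s, eps, grad_log_p):
    """Reparameterization estimate of the negative-ELBO gradient for q = N(m, diag(s^2))."""
    z = m + s * eps                               # location-scale samples, shape (batch, d)
    g = grad_log_p(z)                             # target score at each sample, shape (batch, d)
    grad_m = -g.mean(axis=0)
    grad_s = -(g * eps).mean(axis=0) - 1.0 / s    # -1/s is the gradient of the negative entropy
    return grad_m, grad_s

def psgd_bbvi(grad_log_p, d, precond, steps=2000, step0=0.1,
              batch0=4, batch_growth=0.01,
              s_min=1e-2, s_max=1e3, m_max=1e3, seed=0):
    """Projected SGD with a growing batch and a fixed diagonal preconditioner.

    All schedules and bounds here are hypothetical defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    m, s = np.zeros(d), np.ones(d)
    for t in range(steps):
        batch = batch0 + int(batch_growth * t)    # dynamic batching: batch grows linearly in t
        eps = rng.standard_normal((batch, d))
        gm, gs = neg_elbo_grad(m, s, eps, grad_log_p)
        step = step0 / np.sqrt(1.0 + t)           # decaying step size
        # Preconditioned step, then projection onto a box (coordinate-wise clipping).
        m = np.clip(m - step * precond * gm, -m_max, m_max)
        s = np.clip(s - step * precond * gs, s_min, s_max)
    return m, s

# Toy target: a diagonal Gaussian, so the mean-field optimum is m = mu, s = sigma.
mu = np.array([1.0, -2.0])
sigma = np.array([0.5, 2.0])

def grad_log_p(z):
    return -(z - mu) / sigma**2

# Assumed preconditioner: inverse of the target's diagonal curvature.
m, s = psgd_bbvi(grad_log_p, d=2, precond=sigma**2)
print("m =", m, "s =", s)
```

A box constraint is used here because, with a diagonal preconditioner, coordinate-wise clipping is still the exact projection in the preconditioned metric; a more general constraint set or a full-matrix preconditioner would need a different projection step.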
