Geometric Convergence Analysis of Variational Inference via Bregman Divergences
This work addresses rigorous convergence analysis for variational inference (VI), which is crucial for scalable Bayesian inference. The contribution appears incremental, since it builds on existing geometric and divergence-based methods.
The paper analyzes the convergence of Variational Inference (VI) by developing a theoretical framework that expresses the negative Evidence Lower Bound as a Bregman divergence, enabling geometric analysis of the optimization landscape. Under this framework, it proves non-asymptotic convergence rates for gradient descent algorithms.
Variational Inference (VI) provides a scalable framework for Bayesian inference by optimizing the Evidence Lower Bound (ELBO), but convergence analysis remains challenging due to the objective's non-convexity and non-smoothness in Euclidean space. We establish a novel theoretical framework for analyzing VI convergence by exploiting the exponential family structure of distributions. We express the negative ELBO as a Bregman divergence with respect to the log-partition function, enabling a geometric analysis of the optimization landscape. We show that this Bregman representation admits a weak monotonicity property that, while weaker than convexity, provides sufficient structure for rigorous convergence analysis. By deriving bounds on the objective function along rays in parameter space, we establish properties governed by the spectral characteristics of the Fisher information matrix. Under this geometric framework, we prove non-asymptotic convergence rates for gradient descent algorithms with both constant and diminishing step sizes.
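To make the Bregman representation concrete, the following is a minimal sketch on a toy problem, not the paper's construction. It assumes a one-dimensional Bernoulli family in natural coordinates, where the log-partition function is A(theta) = log(1 + exp(theta)), and a hypothetical target parameter `THETA_STAR` standing in for an exact posterior that lies in the same family. In that setting the negative ELBO equals KL(q_theta || p_theta_star) up to the constant log-evidence term, and this KL divergence is the Bregman divergence B_A(theta_star, theta). Its gradient in theta is A''(theta) (theta - theta_star), that is, the Fisher information times the displacement from the target. All function names, the starting point, and the step-size schedules are illustrative assumptions.

```python
import math


def log_partition(theta):
    """A(theta) = log(1 + exp(theta)), written as a numerically stable softplus."""
    return max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))


def mean_param(theta):
    """A'(theta): the Bernoulli mean, i.e. the logistic sigmoid of theta."""
    if theta >= 0:
        return 1.0 / (1.0 + math.exp(-theta))
    e = math.exp(theta)
    return e / (1.0 + e)


def fisher(theta):
    """A''(theta): Fisher information of the Bernoulli in natural coordinates."""
    m = mean_param(theta)
    return m * (1.0 - m)


def bregman(theta_a, theta_b):
    """B_A(theta_a, theta_b) = A(theta_a) - A(theta_b) - A'(theta_b) * (theta_a - theta_b)."""
    return (log_partition(theta_a) - log_partition(theta_b)
            - mean_param(theta_b) * (theta_a - theta_b))


def kl_bernoulli(p, q):
    """KL(Bernoulli(p) || Bernoulli(q)) computed directly from the means."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def objective(theta, theta_star):
    """KL(q_theta || p_theta_star), expressed as the Bregman divergence B_A(theta_star, theta)."""
    return bregman(theta_star, theta)


def gradient_descent(theta0, theta_star, steps, step_size):
    """Plain gradient descent on the objective; step_size maps iteration index k to a step."""
    theta = theta0
    values = [objective(theta, theta_star)]
    for k in range(steps):
        # Gradient of B_A(theta_star, theta) with respect to theta.
        grad = fisher(theta) * (theta - theta_star)
        theta -= step_size(k) * grad
        values.append(objective(theta, theta_star))
    return theta, values


# Hypothetical target and starting point, chosen only for illustration.
THETA_STAR, THETA_0 = 1.5, -2.0

# Constant step size (assumed value 1.0) and diminishing schedule 1 / sqrt(k + 1).
theta_const, values_const = gradient_descent(
    THETA_0, THETA_STAR, 200, lambda k: 1.0)
theta_dimin, values_dimin = gradient_descent(
    THETA_0, THETA_STAR, 200, lambda k: 1.0 / math.sqrt(k + 1))

print("Bregman form:", objective(THETA_0, THETA_STAR))
print("Direct KL:   ", kl_bernoulli(mean_param(THETA_0), mean_param(THETA_STAR)))
print("Final gap, constant step:   ", abs(theta_const - THETA_STAR))
print("Final gap, diminishing step:", abs(theta_dimin - THETA_STAR))
```

In this toy case the Fisher information never exceeds 1/4, so a unit step shrinks the distance to the target at every iteration without overshooting. The objective is not convex in theta, yet it decreases along the whole trajectory, which gives a flavor of the kind of structure the abstract describes. The rates proved in the paper apply to its general setting and are not reproduced by this sketch.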