Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
This work addresses the scalability of variational inference for high-dimensional Bayesian models. Its contribution is a theoretical refinement of existing convergence guarantees, and it is practically relevant for machine learning practitioners.
The paper tackles high-dimensional optimization in black-box variational inference. It proves that, for strongly log-concave and log-smooth targets, mean-field location-scale families with sub-Gaussian tails need a number of iterations to reach an ε-close solution whose dependence on the dimension d is only O(log d), compared to an O(d) dependence for full-rank families.
We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at a rate that is nearly free of explicit dimension dependence. Specifically, for a $d$-dimensional strongly log-concave and log-smooth target, the number of iterations for BBVI with a sub-Gaussian family to obtain a solution $\epsilon$-close to the global optimum has a dimension dependence of $\mathrm{O}(\log d)$. This is a significant improvement over the $\mathrm{O}(d)$ dependence of full-rank location-scale families. For heavy-tailed families, we prove a weaker $\mathrm{O}(d^{2/k})$ dependence, where $k$ is the number of finite moments of the family. Additionally, if the Hessian of the target log-density is constant, the complexity is free of any explicit dimension dependence. We also prove that our bound on the gradient variance, which is key to our result, cannot be improved using only spectral bounds on the Hessian of the target log-density.
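To make the object of study concrete, the following is a minimal sketch of the one-sample reparametrization gradient for a mean-field location-scale family. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the base distribution is taken to be standard Gaussian (one sub-Gaussian choice), the toy target is a Gaussian with diagonal precision (so its Hessian is constant), and the names `make_gaussian_grad_energy`, `reparam_gradient`, and `average_gradient` are hypothetical.

```python
import random


def make_gaussian_grad_energy(mu, a):
    """Gradient of the energy -log pi(z) for a toy Gaussian target.

    Assumption: the target has mean `mu` and diagonal precision `a`,
    so the gradient of the energy is a_i * (z_i - mu_i) in each coordinate.
    """
    def grad_energy(z):
        return [ai * (zi - mi) for ai, zi, mi in zip(a, z, mu)]
    return grad_energy


def reparam_gradient(grad_energy, m, s, u):
    """One-sample reparametrization gradient of the negative ELBO.

    Mean-field location-scale family: z_i = m_i + s_i * u_i, where u is a
    draw from the base distribution (standard Gaussian in this sketch).
    """
    z = [mi + si * ui for mi, si, ui in zip(m, s, u)]
    g = grad_energy(z)
    # Location gradient: dz_i/dm_i = 1, so it equals the energy gradient at z.
    grad_m = list(g)
    # Scale gradient: dz_i/ds_i = u_i, minus the entropy gradient 1 / s_i
    # (the entropy of this family is sum_i log s_i plus a constant).
    grad_s = [gi * ui - 1.0 / si for gi, si, ui in zip(g, s, u)]
    return grad_m, grad_s


def average_gradient(grad_energy, m, s, n_samples, seed=0):
    """Monte Carlo average of the one-sample estimator over n_samples draws."""
    rng = random.Random(seed)
    d = len(m)
    sum_m = [0.0] * d
    sum_s = [0.0] * d
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        grad_m, grad_s = reparam_gradient(grad_energy, m, s, u)
        sum_m = [acc + gi for acc, gi in zip(sum_m, grad_m)]
        sum_s = [acc + gi for acc, gi in zip(sum_s, grad_s)]
    return [v / n_samples for v in sum_m], [v / n_samples for v in sum_s]
```

For this toy target the minimizer of the negative ELBO is m = mu and s_i = 1/sqrt(a_i), so calling `average_gradient` at that point with a large `n_samples` should return values close to zero in every coordinate. The spread of the individual one-sample estimates around that average is the gradient variance that the abstract identifies as the key quantity in the analysis.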