Provably Scalable Black-Box Variational Inference with Structured Variational Families
This addresses scalability issues in Bayesian inference for practitioners dealing with large datasets, though it is incremental as it builds on existing variational methods.
The paper tackles the poor scalability of full-rank covariance approximations in black-box variational inference (BBVI) for hierarchical Bayesian models, proving that structured variational families achieve an iteration complexity of O(N) instead of O(N^2), which is empirically verified on large-scale models.
Variational families with full-rank covariance approximations are known not to work well in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to e.g. mean-field families. This is particularly critical to hierarchical Bayesian models with local variables; their dimensionality increases with the size of the datasets. Consequently, one gets an iteration complexity with an explicit $\mathcal{O}(N^2)$ dependence on the dataset size $N$. In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of $\mathcal{O}\left(N\right)$, implying better scaling with respect to $N$. We empirically verify our theoretical results on large-scale hierarchical models.