ML LG Jul 22, 2022

Statistical and Computational Trade-offs in Variational Inference: A Case Study in Inferential Model Selection

Kush Bhatia, Nikki Lijing Kuang, Yi-An Ma, Yixin Wang

arXiv:2207.11208v213.88 citations h-index: 27

Originality: Incremental advance

AI Analysis

This work addresses the trade-off between accuracy and speed in large-scale Bayesian inference, a concern for both researchers and practitioners. It is an incremental advance: it builds on existing variational inference methods and contributes a specific theoretical case study.

The paper tackles the trade-offs between statistical accuracy and computational efficiency in variational inference, focusing on Gaussian inferential models with diagonal plus low-rank precision matrices. It theoretically characterizes two kinds of error, Bayesian posterior inference error and frequentist uncertainty quantification error. It shows that, under a fixed computation budget, lower-rank models reduce computational error but increase statistical approximation error, and that for small datasets a full-rank model is not necessary to achieve optimal estimation error.

Variational inference has recently emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea is to trade statistical accuracy for computational efficiency. In this work, we study these statistical and computational trade-offs in variational inference via a case study in inferential model selection. Focusing on Gaussian inferential models (or variational approximating families) with diagonal plus low-rank precision matrices, we initiate a theoretical study of the trade-offs in two aspects, Bayesian posterior inference error and frequentist uncertainty quantification error. From the Bayesian posterior inference perspective, we characterize the error of the variational posterior relative to the exact posterior. We prove that, given a fixed computation budget, a lower-rank inferential model produces variational posteriors with a higher statistical approximation error, but a lower computational error; it reduces variance in stochastic optimization and, in turn, accelerates convergence. From the frequentist uncertainty quantification perspective, we consider the precision matrix of the variational posterior as an uncertainty estimate, which involves an additional statistical error originating from the sampling uncertainty of the data. As a consequence, for small datasets, the inferential model need not be full-rank to achieve optimal estimation error (even with unlimited computation budget).
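
To make the "diagonal plus low-rank precision" family concrete, here is a minimal NumPy sketch. It assumes the precision matrix is parameterized as diag(d) + U Uᵀ with d a positive vector of length p and U a p × r matrix; the abstract does not spell out the paper's exact parameterization, so the names `d`, `U`, `woodbury_solve`, `logdet_precision`, and `sample_gaussian` are illustrative assumptions rather than the authors' implementation. The sketch only shows why a low rank r keeps solves, log-determinants, and sampling cheap.

```python
import numpy as np


def woodbury_solve(d, U, b):
    """Solve (diag(d) + U U^T) x = b via the Woodbury identity.

    d: (p,) positive diagonal, U: (p, r), b: (p,) or (p, n).
    Only an r x r system is factorized, so the cost is O(p r^2).
    """
    Dinv_b = (b.T / d).T                 # D^{-1} b, works for 1-D or 2-D b
    Dinv_U = U / d[:, None]              # D^{-1} U
    cap = np.eye(U.shape[1]) + U.T @ Dinv_U   # r x r capacitance matrix
    return Dinv_b - Dinv_U @ np.linalg.solve(cap, U.T @ Dinv_b)


def logdet_precision(d, U):
    """log det(diag(d) + U U^T) via the matrix determinant lemma."""
    Dinv_U = U / d[:, None]
    cap = np.eye(U.shape[1]) + U.T @ Dinv_U
    _, logdet_cap = np.linalg.slogdet(cap)
    return np.sum(np.log(d)) + logdet_cap


def sample_gaussian(mu, d, U, n_samples, rng):
    """Draw samples from N(mu, (diag(d) + U U^T)^{-1}).

    z = D^{1/2} e1 + U e2 has covariance equal to the precision matrix,
    so applying the inverse precision to z yields covariance precision^{-1}.
    """
    p, r = U.shape
    e1 = rng.standard_normal((n_samples, p))
    e2 = rng.standard_normal((n_samples, r))
    z = e1 * np.sqrt(d) + e2 @ U.T       # rows have covariance D + U U^T
    return mu + woodbury_solve(d, U, z.T).T


# Hypothetical usage with made-up dimensions.
rng = np.random.default_rng(0)
p, r = 6, 2
d = rng.uniform(0.5, 2.0, size=p)
U = rng.standard_normal((p, r))
draws = sample_gaussian(np.zeros(p), d, U, n_samples=5, rng=rng)
print(draws.shape)
```

Under this assumed parameterization the family has p + p·r free precision parameters instead of the p(p+1)/2 of an unrestricted symmetric matrix, and r = 0 reduces to a diagonal precision. That gives a rough sense of how the rank acts as the knob between statistical flexibility and computational cost that the abstract describes.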
