Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach
This work addresses the need for reliable finite-sample error bounds in scalable Bayesian inference, which matter to practitioners who rely on methods such as variational inference. The contribution is incremental in that it builds on existing distance metrics.
Scalable Bayesian inference algorithms lack finite-sample error bounds, which makes their accuracy hard to evaluate. The authors address this by developing an approach that bounds the error of posterior mean and uncertainty estimates using a tractable Fisher distance. They demonstrate the method by deriving bounds for the Laplace approximation and Hilbert coresets, and expect it to apply to other inference techniques as well.
Bayesian inference typically requires the computation of an approximation to the posterior distribution. An important requirement for an approximate Bayesian inference algorithm is to output high-accuracy posterior mean and uncertainty estimates. Classical Monte Carlo methods, particularly Markov chain Monte Carlo, remain the gold standard for approximate Bayesian inference because they have a robust finite-sample theory and reliable convergence diagnostics. However, alternative methods, which are more scalable or apply to problems where Markov chain Monte Carlo cannot be used, lack the same finite-data approximation theory and tools for evaluating their accuracy. In this work, we develop a flexible new approach to bounding the error of mean and uncertainty estimates of scalable inference algorithms. Our strategy is to control the estimation errors in terms of Wasserstein distance, then bound the Wasserstein distance via a generalized notion of Fisher distance. Unlike computing the Wasserstein distance, which requires access to the normalized posterior distribution, the Fisher distance is tractable to compute because it requires access only to the gradient of the log posterior density. We demonstrate the usefulness of our Fisher distance approach by deriving bounds on the Wasserstein error of the Laplace approximation and Hilbert coresets. We anticipate that our approach will be applicable to many other approximate inference methods such as the integrated Laplace approximation, variational inference, and approximate Bayesian computation.
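To illustrate why a gradient-based distance is tractable, the following is a minimal sketch rather than the authors' implementation. It assumes the Fisher distance takes the form of an L2 norm of the difference between the two score functions (gradients of the log densities), averaged under the approximating distribution and estimated by Monte Carlo. The function names `fisher_distance` and `gaussian_score`, and the toy Gaussian target, are hypothetical choices made for this example. Note that the normalizing constant of the target never appears, because it vanishes when the log density is differentiated.

```python
import numpy as np


def fisher_distance(grad_log_target, grad_log_approx, samples):
    """Monte Carlo estimate of an L2 Fisher distance (assumed form).

    samples: array of shape (n, d) drawn from the approximating distribution.
    Each gradient function maps an (n, d) array to an (n, d) array of scores.
    """
    diff = grad_log_target(samples) - grad_log_approx(samples)
    sq_norms = np.sum(diff ** 2, axis=1)  # squared Euclidean norm per sample
    return float(np.sqrt(np.mean(sq_norms)))


def gaussian_score(mean, cov):
    """Return the score function x -> grad log N(x; mean, cov)."""
    precision = np.linalg.inv(cov)

    def score(x):
        # precision is symmetric, so right-multiplication is valid row-wise
        return -(x - mean) @ precision

    return score


rng = np.random.default_rng(0)
dim = 2

# Hypothetical Gaussian approximation (e.g. the output of a Laplace fit).
approx_mean = np.zeros(dim)
approx_cov = np.eye(dim)

# Toy "posterior": a Gaussian with a shifted mean and the same covariance.
target_mean = np.array([0.3, -0.4])
target_cov = np.eye(dim)

samples = rng.multivariate_normal(approx_mean, approx_cov, size=1000)
d = fisher_distance(
    gaussian_score(target_mean, target_cov),
    gaussian_score(approx_mean, approx_cov),
    samples,
)
print(d)  # 0.5 up to floating-point error: the score difference is the constant mean shift
```

In this toy case both covariances are the identity, so the score difference equals the mean shift at every sample and the estimate does not depend on the random draws. For a real posterior, `grad_log_target` would be the gradient of the unnormalized log posterior, for example obtained by automatic differentiation.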