Stein Variational Gaussian Processes
This addresses computational bottlenecks in Bayesian inference for practitioners dealing with complex, large-scale Gaussian process models, offering a more efficient and reliable method compared to existing approaches.
The paper tackles inference in Gaussian process models with non-Gaussian likelihoods and large datasets, where MCMC is slow and variational inference can fail due to multi-modal posteriors, and shows that Stein variational gradient descent provides a faster, non-parametric alternative with proven convergence and substantial performance gains on benchmarks and a large-scale air quality example with 550,134 observations.
We show how to use Stein variational gradient descent (SVGD) to carry out inference in Gaussian process (GP) models with non-Gaussian likelihoods and large data volumes. Markov chain Monte Carlo (MCMC) is extremely computationally intensive for these situations, but the parametric assumptions required for efficient variational inference (VI) result in incorrect inference when they encounter the multi-modal posterior distributions that are common for such models. SVGD provides a non-parametric alternative to variational inference which is substantially faster than MCMC. We prove that for GP models with Lipschitz gradients the SVGD algorithm monotonically decreases the Kullback-Leibler divergence from the sampling distribution to the true posterior. Our method is demonstrated on benchmark problems in both regression and classification, a multimodal posterior, and an air quality example with 550,134 spatiotemporal observations, showing substantial performance improvements over MCMC and VI.