Interpolating between sampling and variational inference with infinite stochastic mixtures
This work addresses the trade-off between efficiency and accuracy in approximate inference for machine learning, offering a flexible family of methods that bridges two major approaches.
The authors address the complementary limitations of sampling and variational inference (VI) with a framework built on stochastic mixtures. The resulting method provably reduces variance relative to sampling and reduces combined bias and variance relative to VI, and it is demonstrated in practice on reference problems.
Sampling and Variational Inference (VI) are two large families of methods for approximate inference that have complementary strengths. Sampling methods excel at approximating arbitrary probability distributions, but can be inefficient. VI methods are efficient, but may misrepresent the true distribution. Here, we develop a general framework where approximations are stochastic mixtures of simple component distributions. Both sampling and VI can be seen as special cases: in sampling, each mixture component is a delta-function and is chosen stochastically, while in standard VI a single component is chosen to minimize divergence. We derive a practical method that interpolates between sampling and VI by solving an optimization problem over a mixing distribution. Intermediate inference methods then arise by varying a single parameter. Our method provably improves on sampling (reducing variance) and on VI (reducing bias+variance despite increasing variance). We demonstrate our method's bias/variance trade-off in practice on reference problems, and we compare outcomes to commonly used sampling and VI methods. This work takes a step towards a highly flexible yet simple family of inference methods that combines the complementary strengths of sampling and VI.
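The interpolation idea can be illustrated with a small toy sketch. This is not the authors' method, which obtains the mixing distribution by solving an optimization problem. The sketch instead assumes a standard normal target, Gaussian components with a shared width `sigma`, and a hand-picked mixing distribution over component means. The names `mixture_estimate`, `estimator_spread`, and `sigma` are hypothetical and introduced only for illustration. Setting `sigma = 0` turns each component into a delta function (plain sampling), while `sigma = 1` collapses the mixture to a single component (the VI-like extreme).

```python
import numpy as np


def mixture_estimate(sigma, n_components, rng):
    """Estimate E[x^2] under a standard normal target using a stochastic
    mixture of Gaussian components with shared width sigma (0 <= sigma <= 1)."""
    # Assumed, hand-picked mixing distribution over component means:
    # N(0, 1 - sigma^2). Convolving it with N(0, sigma^2) components
    # recovers the N(0, 1) target exactly in this toy setting.
    means = rng.normal(0.0, np.sqrt(1.0 - sigma**2), size=n_components)
    # Each Gaussian component has E[x^2] = mean^2 + sigma^2 in closed form,
    # so only the component means are sampled.
    return np.mean(means**2 + sigma**2)


def estimator_spread(sigma, n_components=10, n_repeats=2000, seed=0):
    """Return the mean and variance of the estimate across repeated runs."""
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [mixture_estimate(sigma, n_components, rng) for _ in range(n_repeats)]
    )
    return estimates.mean(), estimates.var()


# sigma = 0.0 is plain sampling; sigma = 1.0 is a single fixed component.
for sigma in (0.0, 0.5, 0.9, 1.0):
    mean, var = estimator_spread(sigma)
    print(f"sigma={sigma:.1f}  mean={mean:.3f}  variance={var:.4f}")
```

In this toy setting the variance of the estimate shrinks as `sigma` grows. Because the target is itself Gaussian, the single-component extreme happens to be exact, so the sketch shows only the variance side of the trade-off and not the bias that a single component would incur on a non-Gaussian target.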