Variational Inference with Mixtures of Isotropic Gaussians
This work addresses scalable Bayesian inference for practitioners, though the contribution is incremental: it builds on existing variational methods and specializes them to a particular parametric family.
The paper tackles the problem of approximating multimodal Bayesian posteriors efficiently by using mixtures of isotropic Gaussians in variational inference, resulting in algorithms that balance accuracy with memory and computational efficiency.
Variational inference (VI) is a popular approach in Bayesian inference that looks for the best approximation of the posterior distribution within a parametric family, minimizing a loss that is typically the (reverse) Kullback-Leibler (KL) divergence. In this paper, we focus on the following parametric family: mixtures of isotropic Gaussians (i.e., with diagonal covariance matrices proportional to the identity) with uniform weights. We develop a variational framework and provide efficient algorithms suited for this family. In contrast with mixtures of Gaussians with generic covariance matrices, this choice strikes a balance between accurately approximating multimodal Bayesian posteriors and remaining memory and computationally efficient. Our algorithms implement gradient descent on the locations of the mixture components (the modes of the Gaussians), and either (entropic) Mirror descent or Bures descent on their variance parameters. We illustrate the performance of our algorithms in numerical experiments.
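To make the setup concrete, here is a minimal sketch of one way such a scheme could look. It is not the paper's algorithm. It assumes a uniform-weight mixture q = (1/N) sum_i N(m_i, v_i I), a Monte Carlo estimate of the reverse KL gradient through the reparameterization x = m_i + sqrt(v_i) z, a plain gradient step on each mean m_i, and an exponentiated-gradient step on each variance v_i (a common form of entropic mirror descent on positive scalars). The names `log_components`, `grad_log_q`, and `vi_step`, the step sizes, the sample count, and the two-mode toy target are all hypothetical choices made for illustration. The Bures variant is not shown.

```python
import math
import random

def log_components(x, means, variances):
    """Log density of each isotropic component N(m_i, v_i * I) at point x."""
    d = len(x)
    out = []
    for m, v in zip(means, variances):
        sq = sum((xj - mj) ** 2 for xj, mj in zip(x, m))
        out.append(-0.5 * d * math.log(2 * math.pi * v) - sq / (2 * v))
    return out

def grad_log_q(x, means, variances):
    """Gradient in x of log q(x) for the uniform-weight mixture q."""
    logs = log_components(x, means, variances)
    top = max(logs)
    w = [math.exp(l - top) for l in logs]  # unnormalized responsibilities
    total = sum(w)
    grad = [0.0] * len(x)
    for wi, m, v in zip(w, means, variances):
        for j in range(len(x)):
            grad[j] += (wi / total) * (m[j] - x[j]) / v
    return grad

def vi_step(means, variances, grad_log_target, rng,
            step_mean=0.1, step_var=0.1, n_samples=16):
    """One illustrative update of all means and variances (assumed scheme)."""
    n, d = len(means), len(means[0])
    new_means, new_vars = [], []
    for m, v in zip(means, variances):
        g_mean = [0.0] * d
        g_var = 0.0
        for _ in range(n_samples):
            z = [rng.gauss(0.0, 1.0) for _ in range(d)]
            x = [mj + math.sqrt(v) * zj for mj, zj in zip(m, z)]
            gq = grad_log_q(x, means, variances)
            gp = grad_log_target(x)
            gf = [a - b for a, b in zip(gq, gp)]  # grad of log(q / target) at x
            for j in range(d):
                g_mean[j] += gf[j] / (n * n_samples)
            # d/dv of f(m + sqrt(v) z) is <grad f(x), z> / (2 sqrt(v))
            g_var += sum(a * b for a, b in zip(gf, z)) / (2 * math.sqrt(v) * n * n_samples)
        # Gradient descent on the mean.
        new_means.append([mj - step_mean * gj for mj, gj in zip(m, g_mean)])
        # Multiplicative (exponentiated-gradient) step keeps the variance positive.
        new_vars.append(v * math.exp(-step_var * g_var))
    return new_means, new_vars

# Toy two-mode target in 2D (hypothetical): itself a uniform mixture, so its
# score can reuse grad_log_q.
rng = random.Random(0)
target_means, target_vars = [[-3.0, 0.0], [3.0, 0.0]], [1.0, 1.0]
grad_log_target = lambda x: grad_log_q(x, target_means, target_vars)

means, variances = [[-1.0, 0.5], [1.0, -0.5]], [2.0, 2.0]
for _ in range(300):
    means, variances = vi_step(means, variances, grad_log_target, rng)
print(means, variances)
```

Each component stores only a mean vector and one scalar variance, which is the memory saving the isotropic family offers over full covariance matrices. The multiplicative variance update is one simple way to stay in the positive half-line without a projection step.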