Particle-based Variational Inference with Generalized Wasserstein Gradient Flow
This work addresses kernel design, a practical bottleneck in particle-based variational inference for machine learning practitioners, and offers an incremental improvement over existing particle-based approaches.
The paper tackles the challenge of designing kernels in particle-based variational inference by proposing a generalized Wasserstein gradient descent (GWG) framework, which admits a broader class of regularizers and comes with strong convergence guarantees; experiments demonstrate its effectiveness and efficiency on simulated and real data.
Particle-based variational inference methods (ParVIs) such as Stein variational gradient descent (SVGD) update the particles based on the kernelized Wasserstein gradient flow for the Kullback-Leibler (KL) divergence. However, the design of kernels is often non-trivial and can restrict the flexibility of the method. Recent works show that functional gradient flow approximations with quadratic form regularization terms can improve performance. In this paper, we propose a ParVI framework, called generalized Wasserstein gradient descent (GWG), based on a generalized Wasserstein gradient flow of the KL divergence, which can be viewed as a functional gradient method with a broader class of regularizers induced by convex functions. We show that GWG exhibits strong convergence guarantees. We also provide an adaptive version that automatically chooses the Wasserstein metric to accelerate convergence. In experiments, we demonstrate the effectiveness and efficiency of the proposed framework on both simulated and real data problems.
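For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of a standard SVGD particle update with an RBF kernel. It is not the GWG method proposed in the paper. The function names (`rbf_kernel`, `svgd_step`, `run_svgd`), the fixed bandwidth `h`, the step size, and the Gaussian target in the usage note are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(x, h=1.0):
    """RBF kernel matrix and its gradients for particles x of shape (n, d).

    h is a fixed bandwidth (an assumption; a median heuristic is also common).
    """
    diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)        # squared pairwise distances
    k = np.exp(-sq_dist / (2.0 * h ** 2))       # k[i, j] = k(x_j, x_i)
    # Gradient of k(x_j, x_i) with respect to x_j; it points from x_j to x_i,
    # so summing over j pushes particle i away from its neighbours.
    grad_k = k[:, :, None] * diff / h ** 2      # shape (n, n, d)
    return k, grad_k

def svgd_step(x, score_fn, step=0.1, h=1.0):
    """One SVGD update: kernel-weighted score (attraction) plus repulsion."""
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    score = score_fn(x)                         # grad log p at each particle, (n, d)
    phi = (k @ score + grad_k.sum(axis=1)) / n  # update direction per particle
    return x + step * phi

def run_svgd(score_fn, x0, n_iter=1000, step=0.1, h=1.0):
    """Iterate svgd_step from initial particles x0 (hypothetical helper)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = svgd_step(x, score_fn, step=step, h=h)
    return x
```

As a usage example under the same assumptions, for a unit-covariance Gaussian target with mean `mu` the score is `-(x - mu)`, so `run_svgd(lambda x: -(x - mu), x0)` should move particles initialized elsewhere toward `mu`. The kernel `k` is the component whose design the abstract describes as non-trivial; GWG replaces this kernelized direction with a functional gradient under a more general convex regularizer.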