Stein Variational Gradient Descent as Gradient Flow
This foundational theoretical work addresses the lack of rigorous convergence analysis for a widely used deterministic sampling algorithm in machine learning.
This paper provides the first theoretical analysis of Stein Variational Gradient Descent (SVGD). It shows that the algorithm's asymptotic behavior is captured by a gradient flow of the KL divergence under a metric structure induced by the Stein operator, and it proves new results on properties of the Stein operator, including a new proof of the distinguishability of Stein discrepancy under weak conditions.
Stein variational gradient descent (SVGD) is a deterministic sampling algorithm that iteratively transports a set of particles to approximate a given distribution, using an efficient gradient-based update that is guaranteed to optimally decrease the KL divergence within a function space. This paper develops the first theoretical analysis of SVGD, discussing its weak convergence properties and showing that its asymptotic behavior is captured by a gradient flow of the KL divergence functional under a new metric structure induced by the Stein operator. We also provide a number of results on the Stein operator and Stein's identity using the notion of weak derivatives, including a new proof of the distinguishability of Stein discrepancy under weak conditions.
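To make the particle update concrete, here is a minimal NumPy sketch of one SVGD iteration in its commonly stated form, where each particle x_i moves along phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]. The abstract does not fix a kernel, bandwidth, or step size, so the RBF kernel, the fixed bandwidth `h`, the step size, and the function names `rbf_kernel` and `svgd_step` are all assumptions made for illustration rather than the paper's own implementation.

```python
import numpy as np


def rbf_kernel(x, h):
    """RBF kernel matrix and its gradients for particles x of shape (n, d).

    Assumed kernel (illustrative choice): k(x, y) = exp(-||x - y||^2 / h).
    """
    diff = x[:, None, :] - x[None, :, :]      # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared pairwise distances
    k = np.exp(-sq_dist / h)                  # k[i, j] = k(x_i, x_j), symmetric
    # grad_k[i, j] = gradient of k(x_j, x_i) with respect to x_j
    #              = (2 / h) * (x_i - x_j) * k(x_j, x_i)
    grad_k = (2.0 / h) * diff * k[:, :, None]
    return k, grad_k


def svgd_step(x, score, step, h):
    """Apply one SVGD update to all particles.

    score(x) must return grad log p evaluated at each particle, shape (n, d).
    """
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    # Driving term: kernel-weighted average of the scores (pulls toward high density).
    drive = k @ score(x)
    # Repulsive term: sum over j of grad_{x_j} k(x_j, x_i) (keeps particles spread out).
    repulse = grad_k.sum(axis=1)
    phi = (drive + repulse) / n
    return x + step * phi


# Usage: transport particles toward a standard normal target, whose score is -x.
rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, scale=1.0, size=(50, 1))
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x, step=0.1, h=1.0)
```

The two terms in `phi` reflect the structure that the gradient-flow view analyzes: the first moves particles along a kernel-smoothed score of the target, and the second acts as a repulsive force that prevents them from collapsing onto a single mode. A fixed bandwidth is used here only for simplicity; in practice the bandwidth is often chosen adaptively from the particles.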