LGST Oct 2, 2022

Improved Stein Variational Gradient Descent with Importance Weights

arXiv:2210.00462v3, 3 citations, h-index: 67
Originality: Incremental advance
AI Analysis

This is an incremental improvement for researchers and practitioners using sampling algorithms in machine learning, as it modifies an existing method to potentially enhance efficiency in specific scenarios.

The authors address the slow convergence of Stein Variational Gradient Descent (SVGD) by introducing importance weights. The resulting method, β-SVGD, has a convergence time that depends only weakly on the initial distribution, whereas the convergence time of standard SVGD depends on the KL divergence between the initial and target distributions.

Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{KL}\left(\cdot\midπ\right)$, where $π$ is the target distribution. In this work, we propose to enhance SVGD via the introduction of importance weights, which leads to a new method for which we coin the name $β$-SVGD. In the continuous time and infinite particles regime, the time for this flow to converge to the equilibrium distribution $π$, quantified by the Stein Fisher information, depends on $ρ_0$ and $π$ very weakly. This is very different from the kernelized gradient flow of Kullback-Leibler divergence, whose time complexity depends on $D_{KL}\left(ρ_0\midπ\right)$. Under certain assumptions, we provide a descent lemma for the population limit $β$-SVGD, which covers the descent lemma for the population limit SVGD when $β\to 0$. We also illustrate the advantages of $β$-SVGD over SVGD by experiments.
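For orientation, the following is a minimal sketch of the standard SVGD particle update that the abstract takes as its baseline. It is not β-SVGD: the importance-weight formula is not given in this summary, so it is left out rather than guessed. The sketch assumes an RBF kernel with a fixed bandwidth and a one-dimensional Gaussian target with mean 2; the names `svgd_step`, `grad_log_pi`, `step_size`, and `bandwidth` are illustrative choices, not taken from the paper.

```python
import numpy as np


def svgd_step(x, grad_log_pi, step_size, bandwidth):
    """One standard SVGD update for particles x of shape (n, d).

    Assumption: RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]  # diff[i, j] = x_i - x_j
    k = np.exp(-np.sum(diff**2, axis=-1) / (2.0 * bandwidth**2))  # kernel matrix (n, n)
    score = grad_log_pi(x)  # gradient of log target density at each particle, (n, d)
    # Driving term: sum_j k(x_j, x_i) * score(x_j), pulls particles toward high density.
    drive = k @ score
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i), keeps particles spread out.
    repulse = np.sum(k[:, :, None] * diff, axis=1) / bandwidth**2
    return x + step_size * (drive + repulse) / n


# Illustrative target (an assumption for this sketch): N(2, 1) in one dimension.
target_mean = 2.0


def grad_log_pi(x):
    return -(x - target_mean)


rng = np.random.default_rng(0)
# Deliberately poor initialization, far from the target.
particles = rng.normal(loc=-5.0, scale=1.0, size=(50, 1))
for _ in range(1000):
    particles = svgd_step(particles, grad_log_pi, step_size=0.1, bandwidth=1.0)

print("particle mean:", particles.mean())
print("particle std:", particles.std())
```

The initialization is placed far from the target on purpose: the abstract's claim concerns how strongly convergence time depends on the initial distribution $ρ_0$, and this baseline is the method whose dependence on $D_{KL}\left(ρ_0\midπ\right)$ the paper aims to weaken.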

