ML · LG · May 20, 2021

Kernel Stein Discrepancy Descent

arXiv:2105.09994v1 · 68 citations
Originality: Incremental advance
AI Analysis

This paper addresses sampling challenges in machine learning and statistics with a new particle-based approach. The advance is incremental, and the authors themselves identify limitations.

The paper tackles the problem of sampling from a target probability distribution known only up to a normalization constant. It proposes KSD Descent, a deterministic, score-based method that moves a set of particles to approximate the target. The method can use robust optimization schemes such as L-BFGS, but it can get stuck in spurious local minima.

Among dissimilarities between probability distributions, the Kernel Stein Discrepancy (KSD) has received much interest recently. We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $π$ on $\mathbb{R}^d$, known up to a normalization constant. This leads to a straightforwardly implementable, deterministic score-based method to sample from $π$, named KSD Descent, which uses a set of particles to approximate $π$. Remarkably, owing to a tractable loss function, KSD Descent can leverage robust parameter-free optimization schemes such as L-BFGS; this contrasts with other popular particle-based schemes such as the Stein Variational Gradient Descent algorithm. We study the convergence properties of KSD Descent and demonstrate its practical relevance. However, we also highlight failure cases by showing that the algorithm can get stuck in spurious local minima.
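
As a rough illustration of the idea in the abstract, the following Python sketch minimizes half the squared KSD of a particle set with SciPy's L-BFGS-B routine. It is not the authors' implementation: the Gaussian kernel, the bandwidth `h`, the standard Gaussian target, and the function names `ksd_squared` and `ksd_descent` are assumptions chosen for illustration, and gradients are approximated by SciPy's default finite differences rather than computed analytically.

```python
import numpy as np
from scipy.optimize import minimize


def score_std_gaussian(x):
    """Score of the assumed target N(0, I): grad log pi(x) = -x."""
    return -x


def ksd_squared(x, score, h=1.0):
    """Squared KSD of the empirical measure of particles x, shape (n, d).

    Uses a Gaussian kernel k(x, y) = exp(-|x - y|^2 / (2 h^2)) (an assumption)
    and averages the Stein kernel over all n^2 particle pairs.
    """
    n, d = x.shape
    s = score(x)                              # (n, d), s[i] = grad log pi(x_i)
    diff = x[:, None, :] - x[None, :, :]      # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)           # squared pairwise distances
    k = np.exp(-sq / (2.0 * h ** 2))          # kernel matrix
    term_ss = s @ s.T                         # s(x_i) . s(x_j)
    # s(x_i) . grad_y k + s(x_j) . grad_x k, with the factor k pulled out
    term_cross = (
        np.einsum("id,ijd->ij", s, diff) - np.einsum("jd,ijd->ij", s, diff)
    ) / h ** 2
    # trace of the mixed second derivative of k, with the factor k pulled out
    term_trace = d / h ** 2 - sq / h ** 4
    return np.mean(k * (term_ss + term_cross + term_trace))


def ksd_descent(x0, score, h=1.0, maxiter=200):
    """Minimize 0.5 * KSD^2 over particle positions with L-BFGS-B.

    No step size is tuned by hand; the gradient is SciPy's finite-difference
    approximation (fine for a handful of particles, slow for many).
    """
    n, d = x0.shape

    def loss(flat):
        return 0.5 * ksd_squared(flat.reshape(n, d), score, h)

    res = minimize(loss, x0.ravel(), method="L-BFGS-B",
                   options={"maxiter": maxiter})
    return res.x.reshape(n, d), res.fun


# Hypothetical usage: 20 particles in 2D, initialized away from the target.
rng = np.random.default_rng(0)
x_init = rng.normal(size=(20, 2)) + 2.0
loss_init = 0.5 * ksd_squared(x_init, score_std_gaussian)
x_final, loss_final = ksd_descent(x_init, score_std_gaussian)
```

Because the loss is an explicit function of the particle positions, any off-the-shelf optimizer can be substituted for L-BFGS-B here. The result still depends on the initialization, which is consistent with the abstract's warning about spurious local minima.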

Code Implementations: 2 repos