MLMay 25, 2017

Convergence of Langevin MCMC in KL-divergence

arXiv:1705.09048v2213 citations
Originality Incremental advance
AI Analysis

This provides theoretical guarantees for sampling algorithms in machine learning and statistics, though it is incremental as it builds on known results with a simpler proof.

The paper establishes that discrete Langevin diffusion achieves KL-divergence convergence to a target distribution under smoothness and strong convexity assumptions, requiring O(d/ε) steps, and extends analysis to cases without strong convexity using a gradient flow framework.

Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$ smooth and $m$ strongly convex, discrete Langevin diffusion produces a distribution $p$ with $KL(p||p^*)\leq ε$ in $\tilde{O}(\frac{d}ε)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes