ML LG PR CO May 4, 2018

Sharp convergence rates for Langevin dynamics in the nonconvex setting

Xiang Cheng, Niladri S. Chatterji, Yasin Abbasi-Yadkori, Peter L. Bartlett, Michael I. Jordan

arXiv:1805.01648v430.0179 citations

Originality: Highly original

AI Analysis

This paper provides theoretical guarantees for sampling from nonconvex (non-log-concave) distributions, an incremental but important advance for applications such as Bayesian inference and machine learning.

The paper tackles the problem of sampling from nonconvex distributions using Langevin dynamics, establishing iteration complexities of $\tilde{\mathcal{O}}\left(e^{cLR^2}d/ε^2\right)$ for overdamped and $\tilde{\mathcal{O}}\left(e^{cLR^2}\sqrt{d}/ε\right)$ for underdamped methods, which are polynomial in dimension and accuracy but exponential in a non-log-concavity measure.

We study the problem of sampling from a distribution $p^*(x) \propto \exp\left(-U(x)\right)$, where the function $U$ is $L$-smooth everywhere and $m$-strongly convex outside a ball of radius $R$, but potentially nonconvex inside this ball. We study both overdamped and underdamped Langevin MCMC and establish upper bounds on the number of steps required to obtain a sample from a distribution that is within $ε$ of $p^*$ in $1$-Wasserstein distance. For the first-order method (overdamped Langevin MCMC), the iteration complexity is $\tilde{\mathcal{O}}\left(e^{cLR^2}d/ε^2\right)$, where $d$ is the dimension of the underlying space. For the second-order method (underdamped Langevin MCMC), the iteration complexity is $\tilde{\mathcal{O}}\left(e^{cLR^2}\sqrt{d}/ε\right)$ for an explicit positive constant $c$. Surprisingly, the iteration complexity for both these algorithms is only polynomial in the dimension $d$ and the target accuracy $ε$. It is exponential, however, in the problem parameter $LR^2$, which is a measure of non-log-concavity of the target distribution.
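
As a rough illustration of the first-order method, the sketch below runs the standard overdamped Langevin update $x_{k+1} = x_k - δ\nabla U(x_k) + \sqrt{2δ}\,ξ_k$ with $ξ_k \sim N(0,1)$ on a one-dimensional toy potential. Everything specific here is an assumption made for the example, not something taken from the paper: the potential $U(x) = x^2/2 - \log\cosh(μx)$ with $μ = 1.5$ (an equal mixture of two unit-variance Gaussians centred at $\pm μ$, so $U$ is nonconvex near the origin and strongly convex far from it), the step size $δ = 0.05$, the burn-in rule, and the names `grad_U` and `overdamped_langevin`.

```python
import math
import random


def grad_U(x, mu=1.5):
    """Gradient of the toy potential U(x) = x^2/2 - log(cosh(mu*x)).

    U''(x) = 1 - mu^2 / cosh(mu*x)^2 is negative near 0 when mu > 1
    (nonconvex inside a ball) and approaches 1 far from the origin.
    This potential is an assumption chosen for illustration.
    """
    return x - mu * math.tanh(mu * x)


def overdamped_langevin(grad, x0, step, n_steps, rng):
    """Iterate x <- x - step * grad(x) + sqrt(2 * step) * xi, xi ~ N(0, 1)."""
    x = x0
    noise_scale = math.sqrt(2.0 * step)
    samples = []
    for _ in range(n_steps):
        x = x - step * grad(x) + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = overdamped_langevin(grad_U, x0=0.0, step=0.05, n_steps=200_000, rng=rng)
kept = samples[len(samples) // 2:]  # discard the first half as burn-in
second_moment = sum(s * s for s in kept) / len(kept)
print(f"estimated E[x^2] = {second_moment:.3f} (exact value under p* is 1 + mu^2 = 3.25)")
```

For this mixture the exact second moment under $p^*$ is $1 + μ^2$, so the printed estimate gives a quick sanity check. The fixed step size introduces a small discretization bias, which is the kind of error the step count in the bounds above has to control.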
