ML COJan 8, 2018

Log-concave sampling: Metropolis-Hastings algorithms are fast

Raaz Dwivedi, Yuansi Chen, Martin J. Wainwright, Bin Yu

arXiv:1801.02309v429.1291 citations

Originality Incremental advance

AI Analysis

This provides faster theoretical guarantees for sampling in machine learning and statistics, though it is incremental as it builds on existing Metropolis-Hastings methods.

The paper tackles the problem of sampling from strongly log-concave densities in high dimensions, proving that the Metropolis-adjusted Langevin algorithm (MALA) achieves an exponentially improved mixing time bound of O(κd log(1/δ)) compared to O(κ²d/δ²) for the unadjusted Langevin algorithm (ULA).

We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^d$, and prove a non-asymptotic upper bound on the mixing time of the Metropolis-adjusted Langevin algorithm (MALA). The method draws samples by simulating a Markov chain obtained from the discretization of an appropriate Langevin diffusion, combined with an accept-reject step. Relative to known guarantees for the unadjusted Langevin algorithm (ULA), our bounds show that the use of an accept-reject step in MALA leads to an exponentially improved dependence on the error-tolerance. Concretely, in order to obtain samples with TV error at most $δ$ for a density with condition number $κ$, we show that MALA requires $\mathcal{O} \big(κd \log(1/δ) \big)$ steps, as compared to the $\mathcal{O} \big(κ^2 d/δ^2 \big)$ steps established in past work on ULA. We also demonstrate the gains of MALA over ULA for weakly log-concave densities. Furthermore, we derive mixing time bounds for the Metropolized random walk (MRW) and obtain $\mathcal{O}(κ)$ mixing time slower than MALA. We provide numerical examples that support our theoretical findings, and demonstrate the benefits of Metropolis-Hastings adjustment for Langevin-type sampling algorithms.

View on arXiv PDF

Similar