ST DS LG AP ML | Feb 20, 2023

Faster high-accuracy log-concave sampling via algorithmic warm starts

arXiv:2302.10249v1 | 57 citations | h-index: 17
Originality: Highly original
AI Analysis

This work resolves an important open problem in computational statistics and machine learning, enabling faster sampling for applications such as Bayesian inference, although it builds incrementally on existing warm-start methods.

The paper tackles the problem of high-accuracy sampling from strongly log-concave distributions by improving the dimension dependence of the Metropolis-adjusted Langevin algorithm (MALA) from $\tilde{O}(d)$ to $\tilde{O}(d^{1/2})$ under warm starts, achieving state-of-the-art guarantees.
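For readers unfamiliar with MALA, the following is a minimal sketch of the classical algorithm, not the paper's warm-start procedure. It assumes a hypothetical diagonal quadratic potential $f(x) = \frac{1}{2}\sum_i a_i x_i^2$, so that the target $\pi \propto e^{-f}$ is a Gaussian whose variances $1/a_i$ are known. The names `f`, `grad_f`, `log_q`, `mala_step`, `run_mala`, and the step size `h` are illustrative choices rather than anything taken from the source.

```python
import math
import random

def f(x, scales):
    """Assumed potential f(x) = 0.5 * sum_i scales[i] * x[i]**2 (Gaussian target)."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(scales, x))

def grad_f(x, scales):
    """Gradient of the assumed quadratic potential."""
    return [a * xi for a, xi in zip(scales, x)]

def log_q(y, x, h, scales):
    """Log density, up to an additive constant, of the Langevin proposal y given x."""
    g = grad_f(x, scales)
    sq = sum((yi - xi + h * gi) ** 2 for yi, xi, gi in zip(y, x, g))
    return -sq / (4.0 * h)

def mala_step(x, h, scales, rng):
    """One MALA step: Langevin proposal followed by a Metropolis-Hastings filter."""
    g = grad_f(x, scales)
    y = [xi - h * gi + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
         for xi, gi in zip(x, g)]
    # log of pi(y) q(x | y) / (pi(x) q(y | x))
    log_alpha = (-f(y, scales) + log_q(x, y, h, scales)) \
        - (-f(x, scales) + log_q(y, x, h, scales))
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return y, True
    return x, False

def run_mala(x0, h, scales, n_steps, seed=0):
    """Run the chain; return all iterates and the empirical acceptance rate."""
    rng = random.Random(seed)
    x = list(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        x, ok = mala_step(x, h, scales, rng)
        accepted += ok
        samples.append(x)
    return samples, accepted / n_steps
```

For example, `run_mala([0.0, 0.0], 0.2, [1.0, 2.0], 20000)` starts the chain at the mode and returns iterates whose empirical coordinate variances should be close to 1 and 0.5. The accept/reject filter is what makes MALA a high-accuracy sampler: the chain is unbiased for $\pi$ at any step size, and the step size only affects how often proposals are accepted.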

Understanding the complexity of sampling from a strongly log-concave and log-smooth distribution $\pi$ on $\mathbb{R}^d$ to high accuracy is a fundamental problem, both from a practical and theoretical standpoint. In practice, high-accuracy samplers such as the classical Metropolis-adjusted Langevin algorithm (MALA) remain the de facto gold standard; and in theory, via the proximal sampler reduction, it is understood that such samplers are key for sampling even beyond log-concavity (in particular, for distributions satisfying isoperimetric assumptions). In this work, we improve the dimension dependence of this sampling problem to $\tilde{O}(d^{1/2})$, whereas the previous best result for MALA was $\tilde{O}(d)$. This closes the long line of work on the complexity of MALA, and moreover leads to state-of-the-art guarantees for high-accuracy sampling under strong log-concavity and beyond (thanks to the aforementioned reduction). Our starting point is that the complexity of MALA improves to $\tilde{O}(d^{1/2})$, but only under a warm start (an initialization with constant Rényi divergence w.r.t. $\pi$). Previous algorithms took much longer to find a warm start than to use it, and closing this gap has remained an important open problem in the field. Our main technical contribution settles this problem by establishing the first $\tilde{O}(d^{1/2})$ Rényi mixing rates for the discretized underdamped Langevin diffusion. For this, we develop new differential-privacy-inspired techniques based on Rényi divergences with Orlicz--Wasserstein shifts, which allow us to sidestep longstanding challenges for proving fast convergence of hypocoercive differential equations.
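To make "discretized underdamped Langevin diffusion" concrete, here is a rough sketch of one possible discretization. It uses a plain Euler-Maruyama scheme for $dX_t = V_t\,dt$, $dV_t = -\nabla f(X_t)\,dt - \gamma V_t\,dt + \sqrt{2\gamma}\,dB_t$, chosen only for simplicity; the abstract does not say which discretization the paper analyzes, and this may not be it. The function name `uld_euler`, the friction parameter `gamma`, and the step size `h` are assumptions made for illustration.

```python
import math
import random

def uld_euler(grad_f, x0, h, gamma, n_steps, seed=0):
    """Euler-Maruyama sketch of the underdamped Langevin diffusion.

    grad_f : callable returning the gradient of the potential as a list
    h      : step size (assumed small enough for stability)
    gamma  : friction coefficient (an illustrative choice)
    Returns the list of position iterates.
    """
    rng = random.Random(seed)
    x = list(x0)
    v = [0.0] * len(x)            # start with zero velocity
    noise = math.sqrt(2.0 * gamma * h)
    path = []
    for _ in range(n_steps):
        g = grad_f(x)             # gradient at the current position
        x_new = [xi + h * vi for xi, vi in zip(x, v)]
        v = [vi - h * (gi + gamma * vi) + noise * rng.gauss(0.0, 1.0)
             for vi, gi in zip(v, g)]
        x = x_new
        path.append(x)
    return path
```

For instance, `uld_euler(lambda x: list(x), [0.0], 0.05, 2.0, 100000)` targets a standard Gaussian in one dimension. Unlike MALA there is no accept/reject filter, so the stationary law of the iterates carries a step-size-dependent bias. That is consistent with the abstract using this diffusion to find a warm start rather than as the final high-accuracy sampler.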

