ML LG PR CO

Aug 20, 2024

Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias

Yifan Chen, Xiaoou Cheng, Jonathan Niles-Weed, Jonathan Weare

arXiv:2408.13115v29.25 citations

h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses the computational bottleneck of sampling in high-dimensional machine learning and statistics, offering a more efficient route to accurate low-dimensional marginals. It is an incremental advance: it builds on existing analyses and holds under specific assumptions.

The paper tackles the poor scaling of the unadjusted Langevin algorithm in high dimensions by showing that convergence for a small number of variables can be much faster: a number of iterations proportional to K, up to logarithmic terms in d, often suffices for all K-marginals. The authors call this phenomenon delocalization of bias. They prove that the effect holds for Gaussian distributions and for strongly log-concave distributions with certain sparse interactions, measuring convergence in a novel W_{2,ℓ^∞} metric.

The unadjusted Langevin algorithm is commonly used to sample probability distributions in extremely high-dimensional settings. However, existing analyses of the algorithm for strongly log-concave distributions suggest that, as the dimension $d$ of the problem increases, the number of iterations required to ensure convergence within a desired error in the $W_2$ metric scales in proportion to $d$ or $\sqrt{d}$. In this paper, we argue that, despite this poor scaling of the $W_2$ error for the full set of variables, the behavior for a small number of variables can be significantly better: a number of iterations proportional to $K$, up to logarithmic terms in $d$, often suffices for the algorithm to converge to within a desired $W_2$ error for all $K$-marginals. We refer to this effect as delocalization of bias. We show that the delocalization effect does not hold universally and prove its validity for Gaussian distributions and strongly log-concave distributions with certain sparse interactions. Our analysis relies on a novel $W_{2,\ell^\infty}$ metric to measure convergence. A key technical challenge we address is the lack of a one-step contraction property in this metric. Finally, we use asymptotic arguments to explore potential generalizations of the delocalization effect beyond the Gaussian and sparse interactions setting.
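
The contrast between full-dimensional and marginal error can be made concrete in the simplest possible case. The following is a minimal sketch, not the paper's analysis: it assumes a standard Gaussian target $N(0, I_d)$, chosen here purely for illustration, and all function names are hypothetical. For this target the unadjusted Langevin update with step size $h$ is $x' = (1-h)x + \sqrt{2h}\,\xi$, whose stationary law is $N(0, s^2 I_d)$ with $s^2 = 1/(1 - h/2)$. The $W_2$ distance between $N(0, s^2 I_m)$ and $N(0, I_m)$ is $\sqrt{m}\,|s - 1|$, so at a fixed step size the bias over all $d$ coordinates grows like $\sqrt{d}$ while the bias of any $K$-marginal grows only like $\sqrt{K}$.

```python
import math
import random


def ula_step(x, grad_potential, step, rng):
    """One unadjusted Langevin step: x - h * grad V(x) + sqrt(2h) * N(0, I)."""
    g = grad_potential(x)
    noise_scale = math.sqrt(2.0 * step)
    return [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, g)]


def standard_gaussian_grad(x):
    # Assumed toy target: V(x) = |x|^2 / 2, so grad V(x) = x.
    return list(x)


def ula_stationary_std(step):
    # For the toy target, the stationary variance s2 solves
    # s2 = (1 - h)^2 * s2 + 2h, giving s2 = 1 / (1 - h / 2) for 0 < h < 2.
    return math.sqrt(1.0 / (1.0 - step / 2.0))


def w2_bias(num_coords, step):
    # W2 between N(0, s^2 I_m) and N(0, I_m) equals sqrt(m) * |s - 1|.
    return math.sqrt(num_coords) * abs(ula_stationary_std(step) - 1.0)


if __name__ == "__main__":
    d, K, h = 10_000, 4, 0.01
    print("W2 bias, all d coordinates:", w2_bias(d, h))
    print("W2 bias, any K-marginal:   ", w2_bias(K, h))

    # Run a short chain to show the sampler itself (illustrative only).
    rng = random.Random(0)
    x = [0.0] * 5
    for _ in range(1000):
        x = ula_step(x, standard_gaussian_grad, h, rng)
    print("sample after 1000 steps:", x)
```

In this product-measure example the per-coordinate bias does not depend on $d$ at all, which is the easiest instance of the behavior the abstract describes; the paper's results concern the much less trivial case where coordinates interact.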
