LGNAMLJun 16, 2020

Hessian-Free High-Resolution Nesterov Acceleration for Sampling

arXiv:2006.09230v410 citations
Originality Highly original
AI Analysis

This work addresses the need for faster sampling methods in machine learning, offering a novel approach with proven acceleration, though it is incremental as it builds on existing optimization techniques.

The paper tackles the problem of accelerating gradient-based MCMC sampling by proposing a diffusion process derived from Nesterov's Accelerated Gradient, achieving quantitative acceleration beyond underdamped Langevin in Wasserstein-2 distance for log-strongly-concave-and-smooth targets.

Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous time limit (noiseless kinetic Langevin) when a finite step-size is employed \citep{shi2021understanding}. This work explores the sampling counterpart of this phenonemon and proposes a diffusion process, whose discretizations can yield accelerated gradient-based MCMC methods. More precisely, we reformulate the optimizer of NAG for strongly convex functions (NAG-SC) as a Hessian-Free High-Resolution ODE, change its high-resolution coefficient to a hyperparameter, inject appropriate noise, and discretize the resulting diffusion process. The acceleration effect of the new hyperparameter is quantified and it is not an artificial one created by time-rescaling. Instead, acceleration beyond underdamped Langevin in $W_2$ distance is quantitatively established for log-strongly-concave-and-smooth targets, at both the continuous dynamics level and the discrete algorithm level. Empirical experiments in both log-strongly-concave and multi-modal cases also numerically demonstrate this acceleration.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes