Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients
This work addresses the need for precise convergence guarantees in MCMC sampling, which is crucial for practitioners in statistics and machine learning, though it is incremental as it builds on existing HMC methods.
The authors tackled the problem of quantifying the mixing time of Metropolized Hamiltonian Monte Carlo (HMC) for sampling from smooth probability densities, providing a non-asymptotic upper bound that shows faster convergence compared to simpler MCMC algorithms like Metropolized random walk or Langevin. They also developed a framework to sharpen mixing time bounds for Markov chains starting far from the target distribution, applying it to improve bounds for these simpler algorithms.
Hamiltonian Monte Carlo (HMC) is a state-of-the-art Markov chain Monte Carlo sampling algorithm for drawing samples from smooth probability densities over continuous spaces. We study the variant most widely used in practice, Metropolized HMC with the Störmer-Verlet or leapfrog integrator, and make two primary contributions. First, we provide a non-asymptotic upper bound on the mixing time of the Metropolized HMC with explicit choices of step-size and number of leapfrog steps. This bound gives a precise quantification of the faster convergence of Metropolized HMC relative to simpler MCMC algorithms such as the Metropolized random walk, or Metropolized Langevin algorithm. Second, we provide a general framework for sharpening mixing time bounds of Markov chains initialized at a substantial distance from the target distribution over continuous spaces. We apply this sharpening device to the Metropolized random walk and Langevin algorithms, thereby obtaining improved mixing time bounds from a non-warm initial distribution.