MLLGOCJul 20, 2017

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

arXiv:1707.06618v3234 citations
Originality Incremental advance
AI Analysis

It provides faster convergence guarantees for nonconvex optimization, which is incremental but important for machine learning practitioners dealing with complex models.

The paper tackles nonconvex finite-sum optimization by analyzing the global convergence of Langevin dynamics algorithms, showing improved gradient complexities for GLD, SGLD, and SVRG-LD, with specific rates such as $ ilde Oig(nd/(λε) ig)$ for GLD.

We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with $n$ component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates. Specifically, we show that gradient Langevin dynamics (GLD) and stochastic gradient Langevin dynamics (SGLD) converge to the almost minimizer within $\tilde O\big(nd/(λε) \big)$ and $\tilde O\big(d^7/(λ^5ε^5) \big)$ stochastic gradient evaluations respectively, where $d$ is the problem dimension, and $λ$ is the spectral gap of the Markov chain generated by GLD. Both results improve upon the best known gradient complexity results (Raginsky et al., 2017). Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (SVRG-LD) to the almost minimizer within $\tilde O\big(\sqrt{n}d^5/(λ^4ε^{5/2})\big)$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime. Our theoretical analyses shed some light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes