STLGMLFeb 3, 2019

Quantitative Weak Convergence for Discrete Stochastic Processes

arXiv:1902.00832v29 citations
AI Analysis

This provides a theoretical foundation for understanding convergence in gradient-based optimization algorithms, which is incremental but important for machine learning practitioners.

The paper tackles the problem of quantifying convergence rates for Langevin-like stochastic processes, including stochastic gradient descent, by proving that iterates converge to an invariant distribution at a rate of $ ilde{O}(1/\sqrt{k})$, which is tight up to log factors, with a reduction to a quantitative Central Limit Theorem for quadratic potentials.

In this paper, we quantitative convergence in $W_2$ for a family of Langevin-like stochastic processes that includes stochastic gradient descent and related gradient-based algorithms. Under certain regularity assumptions, we show that the iterates of these stochastic processes converge to an invariant distribution at a rate of $\tilde{O}\lrp{1/\sqrt{k}}$ where $k$ is the number of steps; this rate is provably tight up to log factors. Our result reduces to a quantitative form of the classical Central Limit Theorem in the special case when the potential is quadratic.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes