ST LG MLFeb 3, 2019

Quantitative Weak Convergence for Discrete Stochastic Processes

Xiang Cheng, Peter L. Bartlett, Michael I. Jordan

arXiv:1902.00832v28.69 citations

Originality Incremental advance

AI Analysis

This provides a theoretical foundation for understanding convergence in gradient-based optimization algorithms, which is incremental but important for machine learning practitioners.

The paper tackles the problem of quantifying convergence rates for Langevin-like stochastic processes, including stochastic gradient descent, by proving that iterates converge to an invariant distribution at a rate of $ ilde{O}(1/\sqrt{k})$, which is tight up to log factors, with a reduction to a quantitative Central Limit Theorem for quadratic potentials.

In this paper, we quantitative convergence in $W_2$ for a family of Langevin-like stochastic processes that includes stochastic gradient descent and related gradient-based algorithms. Under certain regularity assumptions, we show that the iterates of these stochastic processes converge to an invariant distribution at a rate of $\tilde{O}\lrp{1/\sqrt{k}}$ where $k$ is the number of steps; this rate is provably tight up to log factors. Our result reduces to a quantitative form of the classical Central Limit Theorem in the special case when the potential is quadratic.

View on arXiv PDF

Similar