STPRMLSep 14, 2016

Stochastic Heavy Ball

arXiv:1609.04228v2115 citations
Originality Synthesis-oriented
AI Analysis

This work addresses optimization challenges in large-scale machine learning by extending a classic method to stochastic scenarios, though it appears incremental as it adapts an existing approach rather than introducing a new paradigm.

The paper tackles the problem of adapting the deterministic Heavy-ball optimization method to stochastic settings where only unbiased gradient evaluations are available, providing convergence results for non-convex, convex, and strongly convex functions, including non-asymptotic analyses and limit theorems.

This paper deals with a natural stochastic optimization procedure derived from the so-called Heavy-ball method differential equation, which was introduced by Polyak in the 1960s with his seminal contribution [Pol64]. The Heavy-ball method is a second-order dynamics that was investigated to minimize convex functions f . The family of second-order methods recently received a large amount of attention, until the famous contribution of Nesterov [Nes83], leading to the explosion of large-scale optimization problems. This work provides an in-depth description of the stochastic heavy-ball method, which is an adaptation of the deterministic one when only unbiased evalutions of the gradient are available and used throughout the iterations of the algorithm. We first describe some almost sure convergence results in the case of general non-convex coercive functions f . We then examine the situation of convex and strongly convex potentials and derive some non-asymptotic results about the stochastic heavy-ball method. We end our study with limit theorems on several rescaled algorithms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes