ML · LG · ST · CO · Oct 17, 2017

Convergence diagnostics for stochastic gradient descent with constant step size

arXiv:1710.06382v2 · 14 citations
Originality: Incremental advance
AI Analysis

The paper provides a tool for improving the convergence of stochastic gradient descent in machine learning. The contribution is incremental, since it builds on existing methods.

The paper addresses the problem of detecting when stochastic gradient descent moves from a transient phase to a stationary phase. It develops a diagnostic test that activates in the convergence region, and uses it to build a new variant that halves the learning rate each time stationarity is detected. In many settings this variant performs comparably to the state of the art.

Many iterative procedures in stochastic optimization exhibit a transient phase followed by a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in that region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect such a phase transition in the context of stochastic gradient descent with constant learning rate. We present theory and experiments suggesting that the region where the proposed diagnostic is activated coincides with the convergence region. For a class of loss functions, we derive a closed-form solution describing this region. Finally, we suggest an application to speed up convergence of stochastic gradient descent by halving the learning rate each time stationarity is detected. This leads to a new variant of stochastic gradient descent, which in many settings is comparable to the state of the art.
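The halving scheme described in the abstract can be sketched in a few lines of Python. The abstract does not spell out the test statistic, so this sketch assumes one in the spirit of Pflug's classical diagnostic: a running sum of inner products of successive stochastic gradients, with stationarity declared once the sum turns negative after a burn-in period. The function name `sgd_halving`, the `burnin` parameter, and the least-squares loss are illustrative choices, not taken from the paper.

```python
import numpy as np

def sgd_halving(X, y, lr=0.05, burnin=50, n_iter=20000, seed=0):
    """Constant-step SGD on least squares; halve lr when stationarity is flagged.

    Assumed diagnostic (not stated in the abstract): the running sum of inner
    products of successive stochastic gradients. It tends to grow while the
    iterates are still travelling and to drift negative once they oscillate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    prev_grad = None      # gradient from the previous step
    stat = 0.0            # running sum of successive-gradient inner products
    since_reset = 0       # steps taken since the last halving
    halvings = 0

    for _ in range(n_iter):
        i = rng.integers(n)                      # sample one observation
        grad = (X[i] @ theta - y[i]) * X[i]      # gradient of 0.5 * residual^2
        theta = theta - lr * grad

        if prev_grad is not None:
            stat += grad @ prev_grad
        prev_grad = grad
        since_reset += 1

        # Flag stationarity only after the burn-in, then restart the test.
        if since_reset > burnin and stat < 0:
            lr /= 2
            halvings += 1
            stat = 0.0
            since_reset = 0
            prev_grad = None

    return theta, lr, halvings
```

Calling `sgd_halving(X, y)` on a design matrix `X` and response `y` returns the final iterate, the final learning rate, and the number of halvings. A longer `burnin` makes premature detection less likely, at the cost of spending more iterations at each learning rate.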

