OCLGNov 10, 2018

New Convergence Aspects of Stochastic Gradient Algorithms

arXiv:1811.12403v278 citations
AI Analysis

This work provides incremental theoretical improvements for machine learning practitioners by broadening the conditions under which SGD and its variants are guaranteed to converge.

The paper tackles the convergence analysis of stochastic gradient descent (SGD) by extending it to strongly convex objective functions without assuming bounded gradients, showing convergence under diminishing learning rates with sum to infinity, and also proves convergence for the asynchronous Hogwild! algorithm in this regime.

The classical convergence analysis of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this might hold for some loss functions, it is violated for cases where the objective function is strongly convex. In Bottou et al. (2018), a new analysis of convergence of SGD is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm. We show that for stochastic problems arising in machine learning such bound always holds; and we also propose an alternative convergence analysis of SGD with diminishing learning rate regime. We then move on to the asynchronous parallel setting, and prove convergence of Hogwild! algorithm in the same regime in the case of diminished learning rate. It is well-known that SGD converges if a sequence of learning rates $\{η_t\}$ satisfies $\sum_{t=0}^\infty η_t \rightarrow \infty$ and $\sum_{t=0}^\infty η^2_t < \infty$. We show the convergence of SGD for strongly convex objective function without using bounded gradient assumption when $\{η_t\}$ is a diminishing sequence and $\sum_{t=0}^\infty η_t \rightarrow \infty$. In other words, we extend the current state-of-the-art class of learning rates satisfying the convergence of SGD.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes