OCLGOct 10, 2018

Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD

arXiv:1810.04723v437 citations
Originality Highly original
AI Analysis

This work provides a foundational theoretical analysis for optimization algorithms, offering a framework to assess the optimality of step size choices in SGD, which is incremental in refining convergence bounds.

The paper tackles the problem of establishing a tight lower bound on the expected convergence rate of Stochastic Gradient Descent (SGD) for strongly convex functions with diminishing step sizes, proving that recent step size sequences are within a factor of 32 of optimal, independent of dimension, and improving upon existing lower bounds by a factor of 775 times the dimension.

We study the convergence of Stochastic Gradient Descent (SGD) for strongly convex objective functions. We prove for all $t$ a lower bound on the expected convergence rate after the $t$-th SGD iteration; the lower bound is over all possible sequences of diminishing step sizes. It implies that recently proposed sequences of step sizes at ICML 2018 and ICML 2019 are {\em universally} close to optimal in that the expected convergence rate after {\em each} iteration is within a factor $32$ of our lower bound. This factor is independent of dimension $d$. We offer a framework for comparing with lower bounds in state-of-the-art literature and when applied to SGD for strongly convex objective functions our lower bound is a significant factor $775\cdot d$ larger compared to existing work.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes