Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD
This work provides a foundational theoretical analysis for optimization algorithms, offering a framework to assess the optimality of step size choices in SGD, which is incremental in refining convergence bounds.
The paper tackles the problem of establishing a tight lower bound on the expected convergence rate of Stochastic Gradient Descent (SGD) for strongly convex functions with diminishing step sizes, proving that recent step size sequences are within a factor of 32 of optimal, independent of dimension, and improving upon existing lower bounds by a factor of 775 times the dimension.
We study the convergence of Stochastic Gradient Descent (SGD) for strongly convex objective functions. We prove for all $t$ a lower bound on the expected convergence rate after the $t$-th SGD iteration; the lower bound is over all possible sequences of diminishing step sizes. It implies that recently proposed sequences of step sizes at ICML 2018 and ICML 2019 are {\em universally} close to optimal in that the expected convergence rate after {\em each} iteration is within a factor $32$ of our lower bound. This factor is independent of dimension $d$. We offer a framework for comparing with lower bounds in state-of-the-art literature and when applied to SGD for strongly convex objective functions our lower bound is a significant factor $775\cdot d$ larger compared to existing work.