OC LGFeb 10

Step-Size Stability in Stochastic Optimization: A Theoretical Perspective

Fabian Schaipp, Robert M. Gower, Adrien Taylor

arXiv:2602.09842v14.61 citationsh-index: 6

Originality Incremental advance

AI Analysis

This work addresses the stability of optimization algorithms for machine learning practitioners, offering theoretical insights into adaptive step-size methods, though it is incremental as it builds on existing analysis.

The paper tackles the problem of step-size sensitivity in stochastic optimization by theoretically analyzing how performance degrades with large step sizes, showing that adaptive methods like SPS or NGN are more robust than SGD and providing bounds that match experimental results.

We present a theoretical analysis of stochastic optimization methods in terms of their sensitivity with respect to the step size. We identify a key quantity that, for each method, describes how the performance degrades as the step size becomes too large. For convex problems, we show that this quantity directly impacts the suboptimality bound of the method. Most importantly, our analysis provides direct theoretical evidence that adaptive step-size methods, such as SPS or NGN, are more robust than SGD. This allows us to quantify the advantage of these adaptive methods beyond empirical evaluation. Finally, we show through experiments that our theoretical bound qualitatively mirrors the actual performance as a function of the step size, even for nonconvex problems.

View on arXiv PDF

Similar