OCLGMLMay 23, 2025

New Tight Bounds for SGD without Variance Assumption: A Computer-Aided Lyapunov Analysis

arXiv:2505.17965v16 citationsh-index: 22Has Code
Originality Incremental advance
AI Analysis

This work addresses a foundational issue in optimization theory for machine learning practitioners by providing more practical and tighter convergence guarantees for SGD, though it is incremental within an existing research line.

The paper tackles the problem of analyzing Stochastic Gradient Descent (SGD) without relying on variance assumptions, which are often impractical, by proving new theoretical bounds based on convexity and smoothness. It improves state-of-the-art results and extends validity to larger step-sizes, with empirical evidence showing tightness of the bias term.

The analysis of Stochastic Gradient Descent (SGD) often relies on making some assumption on the variance of the stochastic gradients, which is usually not satisfied or difficult to verify in practice. This paper contributes to a recent line of works which attempt to provide guarantees without making any variance assumption, leveraging only the (strong) convexity and smoothness of the loss functions. In this context, we prove new theoretical bounds derived from the monotonicity of a simple Lyapunov energy, improving the current state-of-the-art and extending their validity to larger step-sizes. Our theoretical analysis is backed by a Performance Estimation Problem analysis, which allows us to claim that, empirically, the bias term in our bounds is tight within our framework.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes