ML · LG · Nov 6, 2025

High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes

arXiv:2511.03952v1 · 1 citation · h-index: 20
Originality: Incremental advance
AI Analysis

The paper provides a rigorous framework for comparing SGD variants in high-dimensional machine learning, with a focus on algorithm stability and performance.

The paper compares Stochastic Gradient Descent (SGD) variants, namely SGD with Polyak Momentum (SGD-M) and online SGD with adaptive step-sizes, in high-dimensional settings. It shows that SGD-M run at the same step-size as online SGD can degrade performance, while adaptive step-sizes based on normalized gradients stabilize the dynamics and widen the range of admissible step-sizes.

We develop a high-dimensional scaling limit for Stochastic Gradient Descent with Polyak Momentum (SGD-M) and adaptive step-sizes. This provides a framework to rigorously compare online SGD with some of its popular variants. We show that the scaling limits of SGD-M coincide with those of online SGD after an appropriate time rescaling and a specific choice of step-size. However, if the step-size is kept the same between the two algorithms, SGD-M will amplify high-dimensional effects, potentially degrading performance relative to online SGD. We demonstrate our framework on two popular learning problems: Spiked Tensor PCA and Single Index Models. In both cases, we also examine online SGD with an adaptive step-size based on normalized gradients. In the high-dimensional regime, this algorithm yields multiple benefits: its dynamics admit fixed points closer to the population minimum, and it widens the range of admissible step-sizes for which the iterates converge to such solutions. These examples provide a rigorous account, aligning with empirical motivation, of how early preconditioners can stabilize and improve dynamics in settings where online SGD fails.
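
To make the three algorithms named in the abstract concrete, here is a minimal NumPy sketch of their update rules. It is an illustration under stated assumptions, not the paper's setup: the function names, the heavy-ball form `v = beta * v + grad`, the `grad / ||grad||` normalization, and the toy noiseless linear model (a single index model with identity link) are all choices made here for the example.

```python
import numpy as np


def sgd_step(x, grad, lr):
    """Online SGD: one step against a fresh stochastic gradient."""
    return x - lr * grad


def sgdm_step(x, v, grad, lr, beta):
    """Polyak (heavy-ball) momentum, in one common form (assumed here)."""
    v = beta * v + grad          # velocity accumulates past gradients
    return x - lr * v, v


def normalized_sgd_step(x, grad, lr, eps=1e-12):
    """Adaptive step-size via gradient normalization (assumed form)."""
    return x - lr * grad / (np.linalg.norm(grad) + eps)


def run(method, dim=50, n_steps=2000, lr=0.01, beta=0.9, seed=0):
    """Run one method on a toy online least-squares problem.

    Hypothetical test problem: y = <a, target> with Gaussian a, and
    loss 0.5 * (<a, x> - y)**2. Each step draws a fresh sample (online).
    Returns the final distance to the target.
    """
    rng = np.random.default_rng(seed)
    target = np.zeros(dim)
    target[0] = 1.0
    x = rng.standard_normal(dim) / np.sqrt(dim)
    v = np.zeros(dim)
    for _ in range(n_steps):
        a = rng.standard_normal(dim)
        grad = (a @ x - a @ target) * a  # gradient of the squared loss
        if method == "sgd":
            x = sgd_step(x, grad, lr)
        elif method == "sgdm":
            x, v = sgdm_step(x, v, grad, lr, beta)
        elif method == "normalized":
            x = normalized_sgd_step(x, grad, lr)
        else:
            raise ValueError(f"unknown method: {method}")
    return float(np.linalg.norm(x - target))
```

One way to explore the abstract's comparison is to call `run("sgdm", lr=0.01)` and `run("sgdm", lr=0.01 * (1 - 0.9))` alongside `run("sgd", lr=0.01)`. The `(1 - beta)` rescaling is the standard heuristic that heavy-ball momentum in this form behaves like SGD with effective step-size `lr / (1 - beta)`; the paper's exact rescaling may differ.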

