LGOct 29, 2021

Does Momentum Help? A Sample Complexity Analysis

arXiv:2110.15547v3
Originality Synthesis-oriented
AI Analysis

This work addresses a fundamental uncertainty in optimization theory for researchers, showing that momentum may not help in a broad class of problems, which is incremental as it clarifies prior conflicting claims.

The paper tackles the question of whether momentum methods improve sample complexity in stochastic optimization, specifically for quadratic optimization, and finds that Stochastic Heavy Ball and Nesterov's Accelerated Stochastic Gradient do not outperform vanilla SGD, as both achieve the same lower bound.

Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization. While benefits of such acceleration ideas in deterministic settings are well understood, their advantages in stochastic optimization is still unclear. In fact, in some specific instances, it is known that momentum does not help in the sample complexity sense. Our work shows that a similar outcome actually holds for the whole of quadratic optimization. Specifically, we obtain a lower bound on the sample complexity of SHB and ASG for this family and show that the same bound can be achieved by the vanilla SGD. We note that there exist results claiming the superiority of momentum based methods in quadratic optimization, but these are based on one-sided or flawed analyses.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes