OC | LG | Jun 19, 2024

Open Problem: Anytime Convergence Rate of Gradient Descent

arXiv:2406.13888v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses a foundational open problem in optimization theory, with potential implications for ML/AI practitioners who rely on gradient descent. Its contribution is incremental: it builds on recent results and poses the question without providing a solution.

The paper asks whether any stepsize schedule can improve on gradient descent's classic O(1/T) convergence rate at every stopping time T, and shows that existing stepsize-based acceleration schedules can incur large errors indefinitely.

Recent results show that vanilla gradient descent can be accelerated for smooth convex objectives, merely by changing the stepsize sequence. We show that this can lead to surprisingly large errors indefinitely, and therefore ask: Is there any stepsize schedule for gradient descent that accelerates the classic $\mathcal{O}(1/T)$ convergence rate, at \emph{any} stopping time $T$?
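
As a point of reference for the "classic" rate in the abstract, the following minimal sketch runs gradient descent with the constant stepsize 1/L on a small convex quadratic and checks the textbook bound f(x_T) - f* <= L * ||x_0 - x*||^2 / (2T) at every iteration T, not only the last one. The test problem (a diagonal quadratic with 20 eigenvalues), the helper name run_gd, and the horizon T = 200 are illustrative assumptions and do not come from the paper. No accelerated stepsize schedule is implemented here.

```python
import numpy as np

def run_gd(A, x0, stepsizes):
    """Gradient descent on f(x) = 0.5 * x^T A x, whose minimizer is x* = 0 with f* = 0.

    Returns the suboptimality f(x_t) - f* after each of the steps t = 1..T,
    where T = len(stepsizes).
    """
    x = x0.copy()
    gaps = []
    for h in stepsizes:
        x = x - h * (A @ x)           # gradient of f at x is A x
        gaps.append(0.5 * x @ A @ x)  # f(x_t) - f*, since f* = 0
    return np.array(gaps)

# Hypothetical test problem (an assumption for illustration only).
rng = np.random.default_rng(0)
eigs = np.linspace(0.01, 1.0, 20)
A = np.diag(eigs)
L = eigs.max()                        # smoothness constant of f
x0 = rng.standard_normal(20)
T = 200

# Constant stepsize 1/L: the classic, non-accelerated schedule.
gaps = run_gd(A, x0, np.full(T, 1.0 / L))

# Textbook bound L * ||x0 - x*||^2 / (2t), evaluated at every t = 1..T.
bound = L * (x0 @ x0) / (2 * np.arange(1, T + 1))

# "Anytime" check: the bound must hold at every stopping time, not just at t = T.
anytime_ok = bool(np.all(gaps <= bound))
print("bound holds at every t:", anytime_ok)
```

Passing a different array as stepsizes lets the same per-iteration check be applied to any other schedule, which is the kind of guarantee the open problem asks about: a rate faster than O(1/T) that holds at every T rather than only at selected stopping times.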
