OCLGJan 27, 2022

Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(ε^{-7/4})$ Complexity

arXiv:2201.11411v436 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses a theoretical bottleneck in optimization algorithms for machine learning researchers, offering a simpler and more efficient method for nonconvex problems, though it is incremental as it builds on classical techniques.

The paper tackles the problem of achieving faster convergence to approximate first-order stationary points in nonconvex optimization with Lipschitz continuous gradient and Hessian, proposing restarted accelerated gradient methods that achieve an $O(ε^{-7/4})$ complexity without polylogarithmic factors, improving over prior methods by an $O(\log(1/ε))$ factor.

This paper studies accelerated gradient methods for nonconvex optimization with Lipschitz continuous gradient and Hessian. We propose two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and restarted heavy ball (HB) method, and establish that our methods achieve an $ε$-approximate first-order stationary point within $O(ε^{-7/4})$ number of gradient evaluations by elementary proofs. Theoretically, our complexity does not hide any polylogarithmic factors, and thus it improves over the best known one by the $O(\log\frac{1}ε)$ factor. Our algorithms are simple in the sense that they only consist of Nesterov's classical AGD or Polyak's HB iterations, as well as a restart mechanism. They do not invoke negative curvature exploitation or minimization of regularized surrogate functions as the subroutines. In contrast with existing analysis, our elementary proofs use less advanced techniques and do not invoke the analysis of strongly convex AGD or HB. Code is avaliable at https://github.com/lihuanML/RestartAGD.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes