Understanding Nesterov's Acceleration via Proximal Point Method
The work provides conceptual clarity for optimization researchers, though the contribution is incremental because it builds on known methods.
The paper addresses the problem of understanding Nesterov's accelerated gradient method by deriving it as an approximation of the proximal point method, which yields simple derivations and convergence analyses that unify existing variants.
The proximal point method (PPM) is a fundamental method in optimization that is often used as a building block for designing optimization algorithms. In this work, we use PPM to provide conceptually simple derivations along with convergence analyses of different versions of Nesterov's accelerated gradient method (AGM). The key observation is that AGM is a simple approximation of PPM, which results in an elementary derivation of the update equations and stepsizes of AGM. This view also leads to a transparent and conceptually simple analysis of AGM's convergence by using the analysis of PPM. The derivations also naturally extend to the strongly convex case. Ultimately, the results presented in this paper are of both didactic and conceptual value; they unify and explain existing variants of AGM while motivating other accelerated methods for practically relevant settings.
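To make the objects in the abstract concrete, here is a minimal Python sketch; it is not taken from the paper. It assumes a hypothetical diagonal quadratic f(x) = 0.5 * sum_i a_i * x_i^2, chosen only because the proximal step argmin_u f(u) + ||u - x||^2 / (2*eta) then has a closed form. Replacing f in that subproblem by its linearization at x gives the plain gradient step x - eta * grad f(x), which is the simplest sense in which a gradient method approximates PPM. The AGM shown is the common textbook form with momentum (t - 1) / (t + 2), which may differ in notation from the variants derived in the paper. All function names, the test problem, and the parameters are illustrative assumptions.

```python
def f(x, a):
    """Assumed test objective: f(x) = 0.5 * sum_i a_i * x_i^2, minimized at x = 0."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def grad(x, a):
    """Gradient of the diagonal quadratic above."""
    return [ai * xi for ai, xi in zip(a, x)]


def ppm_step(x, a, eta):
    """Exact proximal step: argmin_u f(u) + ||u - x||^2 / (2 * eta).

    Setting the gradient a_i * u_i + (u_i - x_i) / eta to zero gives
    u_i = x_i / (1 + eta * a_i) for this particular quadratic.
    """
    return [xi / (1.0 + eta * ai) for ai, xi in zip(a, x)]


def gradient_descent(x0, a, L, iters):
    """Gradient descent with stepsize 1/L, i.e. the proximal step applied
    to the linearization of f at the current iterate."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x, a)
        x = [xi - gi / L for xi, gi in zip(x, g)]
    return x


def nesterov_agm(x0, a, L, iters):
    """Textbook AGM for smooth convex f (an assumed standard variant)."""
    x = list(x0)  # main iterate
    y = list(x0)  # extrapolated point where the gradient is evaluated
    for t in range(1, iters + 1):
        g = grad(y, a)
        x_new = [yi - gi / L for yi, gi in zip(y, g)]
        beta = (t - 1) / (t + 2)  # momentum weight
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x = x_new
    return x


if __name__ == "__main__":
    a = [1.0, 0.01, 0.001]   # ill-conditioned curvatures, so L = max(a) = 1
    x0 = [1.0, 1.0, 1.0]
    print("one PPM step (eta=10):", f(ppm_step(x0, a, 10.0), a))
    print("GD,  200 iterations:  ", f(gradient_descent(x0, a, 1.0, 200), a))
    print("AGM, 200 iterations:  ", f(nesterov_agm(x0, a, 1.0, 200), a))
```

On this ill-conditioned example the accelerated iterates reach a lower objective value than gradient descent after the same number of gradient evaluations, which is the behavior the PPM viewpoint is meant to explain. The exact proximal step is cheap here only because of the closed form; for a general f it requires solving a subproblem, which is why approximations to it are of interest.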