OC, LG, ML
Mar 14, 2016

A Variational Perspective on Accelerated Methods in Optimization

arXiv:1603.04245v1
649 citations
Originality: Incremental advance
AI Analysis

This paper provides a unifying theoretical framework for accelerated optimization methods. The advance is incremental, but it clarifies the scope of these methods for researchers in optimization and machine learning.

The paper tackles the problem of understanding the natural scope of accelerated optimization methods by studying them from a continuous-time perspective, showing that a Bregman Lagrangian generates a large class of such methods and that Nesterov's technique can be viewed as a systematic way to derive discrete-time algorithms from continuous-time curves.

Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. While many generalizations and extensions of Nesterov's original acceleration method have been proposed, it is not yet clear what the natural scope of the acceleration concept is. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a Lagrangian functional, which we call the Bregman Lagrangian, that generates a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods. We show that the continuous-time limits of all of these methods correspond to traveling the same curve in spacetime at different speeds. From this perspective, Nesterov's technique and many of its generalizations can be viewed as a systematic way to go from the continuous-time curves generated by the Bregman Lagrangian to a family of discrete-time accelerated algorithms.
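
To make the discrete-time side of the abstract concrete, the following is a minimal sketch that compares plain gradient descent with the standard textbook form of Nesterov's accelerated gradient descent. The test function (a diagonal quadratic), the starting point, the step size 1/L, and the momentum schedule (k - 1)/(k + 2) are illustrative assumptions and are not taken from the paper. The sketch is not the paper's Bregman Lagrangian discretization; it only illustrates the kind of accelerated algorithm the abstract refers to.

```python
# Sketch only: the problem data and all names below are hypothetical.

def f(x, a):
    """Objective f(x) = 0.5 * sum_i a_i * x_i**2, minimized at x = 0."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def grad(x, a):
    """Gradient of f: component i is a_i * x_i."""
    return [ai * xi for ai, xi in zip(a, x)]


def gradient_descent(x0, a, step, iters):
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(iters):
        x = [xi - step * gi for xi, gi in zip(x, grad(x, a))]
    return x


def nesterov(x0, a, step, iters):
    """Nesterov's accelerated gradient descent (textbook form).

    A gradient step is taken from the extrapolated point y, then y is
    updated with momentum weight (k - 1) / (k + 2).
    """
    x = list(x0)
    y = list(x0)
    for k in range(1, iters + 1):
        x_next = [yi - step * gi for yi, gi in zip(y, grad(y, a))]
        momentum = (k - 1) / (k + 2)
        y = [xn + momentum * (xn - xi) for xn, xi in zip(x_next, x)]
        x = x_next
    return x


# Assumed toy problem: ill-conditioned diagonal quadratic, L = max(A).
A = [1.0, 0.01, 0.001]
X0 = [1.0, 1.0, 1.0]
STEP = 1.0 / max(A)
ITERS = 100

f_gd = f(gradient_descent(X0, A, STEP, ITERS), A)
f_nag = f(nesterov(X0, A, STEP, ITERS), A)
print(f"gradient descent: {f_gd:.3e}")
print(f"Nesterov:         {f_nag:.3e}")
```

On this toy problem the accelerated iterate should reach a lower objective value than gradient descent after the same number of steps, in line with the known O(1/k^2) versus O(1/k) worst-case rates for smooth convex functions. The result depends on the assumed problem data, so it is an illustration and not evidence for the paper's claims.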

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
