OC, LG, ML | Jun 2, 2019

Generalized Momentum-Based Methods: A Hamiltonian Perspective

arXiv:1906.00436v3 | 66 citations
AI Analysis

This work provides a theoretical framework for analyzing momentum methods in optimization. The contribution is incremental but useful to researchers in machine learning and optimization.

The authors use a Hamiltonian perspective to generalize momentum-based optimization methods such as Nesterov's accelerated gradient descent and Polyak's heavy ball method. This yields a unified nonasymptotic convergence analysis: in function value for convex problems, and in gradient norm for unconstrained, possibly nonconvex problems.

We take a Hamiltonian-based perspective to generalize Nesterov's accelerated gradient descent and Polyak's heavy ball method to a broad class of momentum methods in the setting of (possibly) constrained minimization in Euclidean and non-Euclidean normed vector spaces. Our perspective leads to a generic and unifying nonasymptotic analysis of convergence of these methods in both the function value (in the setting of convex optimization) and in norm of the gradient (in the setting of unconstrained, possibly nonconvex, optimization). Our approach relies upon a time-varying Hamiltonian that produces generalized momentum methods as its equations of motion. The convergence analysis for these methods is intuitive and is based on the conserved quantities of the time-dependent Hamiltonian.
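For orientation, the two classical methods named in the abstract can be sketched in their standard textbook forms. This is only an illustration, not the paper's Hamiltonian-derived generalized methods. The quadratic objective, the constants `MU` and `L`, the function names, and the step-size and momentum choices (the usual ones for strongly convex quadratics) are all assumptions made for this example.

```python
import math

# Hypothetical test problem: f(x) = 0.5 * (MU * x0^2 + L * x1^2), minimized at the origin.
MU, L = 1.0, 10.0

def f(x):
    return 0.5 * (MU * x[0] ** 2 + L * x[1] ** 2)

def grad(x):
    return [MU * x[0], L * x[1]]

def heavy_ball(x0, step, beta, iters):
    # Polyak's heavy ball: x_{k+1} = x_k - step * grad(x_k) + beta * (x_k - x_{k-1}).
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - step * gi + beta * (xi - pi)
                  for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x

def nesterov(x0, step, beta, iters):
    # Nesterov's method: the gradient is evaluated at the extrapolated point
    # y_k = x_k + beta * (x_k - x_{k-1}), then x_{k+1} = y_k - step * grad(y_k).
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = grad(y)
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        x_prev, x = x, x_next
    return x

# Textbook parameter choices for a strongly convex quadratic (assumed, not from the paper).
sqrt_l, sqrt_mu = math.sqrt(L), math.sqrt(MU)
HB_STEP = 4.0 / (sqrt_l + sqrt_mu) ** 2
HB_BETA = ((sqrt_l - sqrt_mu) / (sqrt_l + sqrt_mu)) ** 2
NAG_STEP = 1.0 / L
NAG_BETA = (sqrt_l - sqrt_mu) / (sqrt_l + sqrt_mu)

start = [1.0, 1.0]
print("heavy ball f:", f(heavy_ball(start, HB_STEP, HB_BETA, 100)))
print("nesterov   f:", f(nesterov(start, NAG_STEP, NAG_BETA, 100)))
```

The two updates differ only in where the gradient is evaluated: at the current iterate for heavy ball, and at the extrapolated point for Nesterov's method.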

