Generalized Momentum-Based Methods: A Hamiltonian Perspective
This work provides a theoretical framework for analyzing momentum methods in optimization. The contribution is incremental, but it is useful for researchers in machine learning and optimization.
The authors use a Hamiltonian perspective to generalize momentum-based optimization methods such as Nesterov's accelerated gradient descent and Polyak's heavy ball method. The result is a unified nonasymptotic convergence analysis that covers the function value in the convex setting and the gradient norm in the unconstrained, possibly nonconvex, setting.
We take a Hamiltonian-based perspective to generalize Nesterov's accelerated gradient descent and Polyak's heavy ball method to a broad class of momentum methods in the setting of (possibly) constrained minimization in Euclidean and non-Euclidean normed vector spaces. Our perspective leads to a generic and unifying nonasymptotic analysis of convergence of these methods in both the function value (in the setting of convex optimization) and in the norm of the gradient (in the setting of unconstrained, possibly nonconvex, optimization). Our approach relies upon a time-varying Hamiltonian that produces generalized momentum methods as its equations of motion. The convergence analysis for these methods is intuitive and is based on the conserved quantities of the time-dependent Hamiltonian.
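
For readers less familiar with the two classical methods named in the abstract, the following is a minimal Python sketch of their standard discrete-time forms. It is not the paper's Hamiltonian construction. The quadratic test function, the constants L_SMOOTH and MU, the starting point, and all function names are assumptions chosen for illustration. The heavy ball parameters are the textbook choice for strongly convex quadratics, and the Nesterov variant uses the common FISTA-style momentum sequence.

```python
import math

# Illustrative problem (an assumption, not from the paper):
# f(x) = 0.5 * (MU * x0^2 + L_SMOOTH * x1^2), minimized at the origin.
MU = 1.0         # strong convexity constant of the toy quadratic
L_SMOOTH = 10.0  # smoothness (gradient Lipschitz) constant


def f(x):
    return 0.5 * (MU * x[0] ** 2 + L_SMOOTH * x[1] ** 2)


def grad(x):
    return [MU * x[0], L_SMOOTH * x[1]]


def heavy_ball(x0, step, beta, iters):
    """Polyak's heavy ball: x_{k+1} = x_k - step * grad(x_k) + beta * (x_k - x_{k-1})."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - step * gi + beta * (xi - xp)
                  for xi, gi, xp in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x


def nesterov(x0, step, iters):
    """Nesterov's accelerated gradient descent with FISTA-style momentum weights."""
    x = list(x0)  # main iterate
    y = list(x0)  # extrapolated point where the gradient is evaluated
    t = 1.0
    for _ in range(iters):
        g = grad(y)
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        t_next = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        momentum = (t - 1.0) / t_next
        y = [xn + momentum * (xn - xi) for xn, xi in zip(x_next, x)]
        x, t = x_next, t_next
    return x


# Textbook heavy ball parameters for a strongly convex quadratic.
SQRT_L, SQRT_MU = math.sqrt(L_SMOOTH), math.sqrt(MU)
HB_STEP = 4.0 / (SQRT_L + SQRT_MU) ** 2
HB_BETA = ((SQRT_L - SQRT_MU) / (SQRT_L + SQRT_MU)) ** 2

x_hb = heavy_ball([1.0, 1.0], HB_STEP, HB_BETA, 200)
x_nag = nesterov([1.0, 1.0], 1.0 / L_SMOOTH, 200)
print("heavy ball f(x):", f(x_hb))
print("Nesterov   f(x):", f(x_nag))
```

Both routines differ only in where the gradient is evaluated and how the momentum term is weighted, which is the kind of variation a unifying framework for momentum methods would be expected to absorb.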