OC | ML | Mar 11, 2019

Conformal Symplectic and Relativistic Optimization

arXiv:1903.04100v7 | 77 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient and stable optimization algorithms in machine learning, though it appears incremental as it builds on existing methods.

The paper tackles the problem of understanding and improving momentum-based optimization methods by analyzing their symplectic structure and proposing a new algorithm based on dissipative relativistic systems. The result is a generalized method that recovers both Nesterov's accelerated gradient and Polyak's heavy ball as limiting cases, offering potential stability and speed advantages at no additional cost.

Arguably, the two most popular accelerated or momentum-based optimization methods in machine learning are Nesterov's accelerated gradient and Polyak's heavy ball, both corresponding to different discretizations of a particular second-order differential equation with friction. Such connections with continuous-time dynamical systems have been instrumental in demystifying acceleration phenomena in optimization. Here we study structure-preserving discretizations for a certain class of dissipative (conformal) Hamiltonian systems, allowing us to analyze the symplectic structure of both Nesterov and heavy ball, besides providing several new insights into these methods. Moreover, we propose a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization. Importantly, such a method generalizes both Nesterov and heavy ball, each being recovered as distinct limiting cases, and has potential advantages at no additional cost.
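For orientation, the two classical methods named in the abstract can be sketched in their common textbook velocity form. This is a minimal illustration, not the paper's code: the toy quadratic, the function names `grad` and `momentum_descent`, and the values of `step` and `mu` are all assumptions chosen for the example. The paper's relativistic method is not reproduced here.

```python
def grad(x):
    # Gradient of an assumed toy objective f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2).
    return [x[0], 10.0 * x[1]]


def momentum_descent(x0, step=0.05, mu=0.9, iters=500, lookahead=False):
    """Run heavy ball (lookahead=False) or Nesterov (lookahead=True).

    Both keep a velocity v and differ only in where the gradient is
    evaluated: at x for heavy ball, at x + mu * v for Nesterov.
    """
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(iters):
        if lookahead:
            point = [xi + mu * vi for xi, vi in zip(x, v)]
        else:
            point = x
        g = grad(point)
        # Friction shrinks the old velocity; the gradient pushes downhill.
        v = [mu * vi - step * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    print("heavy ball:", momentum_descent([1.0, 1.0]))
    print("Nesterov:  ", momentum_descent([1.0, 1.0], lookahead=True))
```

With these assumed settings both variants drive the iterate toward the minimizer at the origin; the only difference between them is the single line that chooses the gradient evaluation point.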

Code Implementations: 1 repo
