LG · Jun 3, 2025

Heavy-Ball Momentum Method in Continuous Time and Discretization Error Analysis

arXiv:2506.14806v2 · h-index: 2
Originality: Incremental advance
AI Analysis

This work gives optimization researchers incremental theoretical tools for controlling the discretization error between the discrete Heavy-Ball momentum method and its continuous-time approximations, with potential applications in deep learning.

The paper addresses the gap between the discrete Heavy-Ball momentum method and its continuous-time approximations. It designs a piece-wise continuous differential equation with explicit control of the discretization error, so that the error can be reduced to an arbitrary order of the step size, and applies this model to analyze implicit regularization in deep learning.

This paper establishes a continuous-time approximation, a piece-wise continuous differential equation, for the discrete Heavy-Ball (HB) momentum method with explicit discretization error. Investigating continuous differential equations has been a promising approach for studying discrete optimization methods. Despite the crucial role of momentum in gradient-based optimization methods, the gap between the original discrete dynamics and the continuous-time approximations due to the discretization error has not yet been comprehensively bridged. In this work, we study the HB momentum method in continuous time while putting more focus on the discretization error, to provide additional theoretical tools for this area. In particular, we design a first-order piece-wise continuous differential equation, to which we add a number of counter terms that account for the discretization error explicitly. As a result, we provide a continuous-time model for the HB momentum method that allows the discretization error to be controlled to an arbitrary order of the step size. As an application, we leverage it to find a new implicit regularization of the directional smoothness and investigate the implicit bias of HB for diagonal linear networks, indicating how our results can be used in deep learning. Our theoretical findings are further supported by numerical experiments.
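To make the notion of discretization error concrete, the following is a minimal sketch and not the paper's construction. It assumes the standard Heavy-Ball update x_{k+1} = x_k - eta * grad f(x_k) + beta * (x_k - x_{k-1}) on a toy diagonal quadratic, and compares it with the well-known leading-order continuous model, gradient flow with the rescaled rate 1 / (1 - beta). The paper's counter terms, which would push the error to higher order in the step size, are not reproduced here. The function names, the quadratic, and all parameter values are illustrative assumptions.

```python
import numpy as np


def heavy_ball(x0, curvatures, eta, beta, steps):
    """Discrete Heavy-Ball on the toy objective f(x) = 0.5 * sum(curvatures * x**2)."""
    x_prev, x = x0.copy(), x0.copy()  # x_{-1} = x_0, i.e. zero initial velocity
    trajectory = [x.copy()]
    for _ in range(steps):
        grad = curvatures * x
        # Standard HB step: gradient step plus momentum term beta * (x_k - x_{k-1}).
        x_prev, x = x, x - eta * grad + beta * (x - x_prev)
        trajectory.append(x.copy())
    return np.array(trajectory)


def rescaled_gradient_flow(x0, curvatures, eta, beta, steps):
    """Closed-form solution of dx/dt = -grad f(x) / (1 - beta), sampled at t = k * eta.

    This is only the leading-order continuous model (an assumption of this sketch),
    without any higher-order correction terms.
    """
    t = eta * np.arange(steps + 1)
    return x0 * np.exp(-np.outer(t, curvatures) / (1.0 - beta))


def max_gap(eta, beta=0.5, horizon=2.0):
    """Largest distance between the two trajectories over a fixed time horizon."""
    x0 = np.array([1.0, 1.0])
    curvatures = np.array([1.0, 2.0])
    steps = int(round(horizon / eta))
    hb = heavy_ball(x0, curvatures, eta, beta, steps)
    flow = rescaled_gradient_flow(x0, curvatures, eta, beta, steps)
    return np.max(np.linalg.norm(hb - flow, axis=1))


gap_coarse = max_gap(eta=0.02)
gap_fine = max_gap(eta=0.01)
print(f"max gap at eta=0.02: {gap_coarse:.4f}")
print(f"max gap at eta=0.01: {gap_fine:.4f}")
```

In this toy setting the gap shrinks as the step size is halved, which is consistent with a first-order discretization error for the uncorrected model. A model with explicit correction terms, as the abstract describes, would be expected to shrink this gap faster in the step size.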

