LG OC · May 21, 2024

Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows

Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré

arXiv:2405.12888v112.59 citations · h-index: 6 · Has Code · ICML

Originality: Highly original

AI Analysis

This work addresses a theoretical gap in the understanding of conservation laws for machine learning optimization. The contribution is incremental, but it provides foundational insights for researchers in optimization and neural network training.

The paper characterizes conservation laws for momentum-based dynamics and non-Euclidean geometries. It proves that these laws depend on time, and that moving from gradient flow to momentum dynamics often causes a 'conservation loss': linear networks retain fewer laws (except in sufficiently over-parameterized regimes), and ReLU networks retain none.

Conservation laws are well-established in the context of Euclidean gradient flow dynamics, notably for linear or ReLU neural network training. Yet, their existence and principles for non-Euclidean geometries and momentum-based dynamics remain largely unknown. In this paper, we characterize "all" conservation laws in this general setting. In stark contrast to the case of gradient flows, we prove that the conservation laws for momentum-based dynamics exhibit temporal dependence. Additionally, we often observe a "conservation loss" when transitioning from gradient flow to momentum dynamics. Specifically, for linear networks, our framework allows us to identify all momentum conservation laws, which are less numerous than in the gradient flow case except in sufficiently over-parameterized regimes. With ReLU networks, no conservation law remains. This phenomenon also manifests in non-Euclidean metrics, used e.g. for Nonnegative Matrix Factorization (NMF): all conservation laws can be determined in the gradient flow context, yet none persists in the momentum case.
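As a rough illustration of the "conservation loss" described in the abstract, the sketch below compares two dynamics on a toy model that is not taken from the paper: a scalar two-layer linear network with loss L(u, v) = (uv - y)^2 / 2. Under gradient flow, the quantity u^2 - v^2 is conserved. Under a heavy-ball ODE of the form θ'' + τθ' = -∇L (an assumed form of the momentum dynamics), the same quantity h satisfies h'' + τh' = 2(u'^2 - v'^2), so it generally drifts. The target y, the damping τ, the initial point, and the step size are all hypothetical choices made for this example.

```python
# Toy check: u^2 - v^2 under gradient flow vs. heavy-ball momentum.
# Model, loss, tau, dt and the initial point are illustrative assumptions.

def grad(u, v, y=1.0):
    """Gradient of L(u, v) = 0.5 * (u * v - y) ** 2."""
    r = u * v - y
    return r * v, r * u

def conserved(u, v):
    """Quantity conserved by Euclidean gradient flow on this model."""
    return u * u - v * v

def gradient_flow(u, v, dt=1e-4, steps=50_000):
    """Explicit Euler discretization of theta' = -grad L(theta)."""
    for _ in range(steps):
        gu, gv = grad(u, v)
        u, v = u - dt * gu, v - dt * gv
    return u, v

def heavy_ball(u, v, tau=1.0, dt=1e-4, steps=50_000):
    """Semi-implicit Euler for theta'' + tau * theta' = -grad L(theta)."""
    pu = pv = 0.0  # velocities start at rest
    for _ in range(steps):
        gu, gv = grad(u, v)
        pu += dt * (-tau * pu - gu)
        pv += dt * (-tau * pv - gv)
        u, v = u + dt * pu, v + dt * pv
    return u, v

u0, v0 = 2.0, 1.0
h0 = conserved(u0, v0)
drift_gf = abs(conserved(*gradient_flow(u0, v0)) - h0)
drift_hb = abs(conserved(*heavy_ball(u0, v0)) - h0)
print(f"gradient flow drift: {drift_gf:.2e}")
print(f"heavy-ball drift:    {drift_hb:.2e}")
```

The small residual drift in the gradient-flow run comes from the time discretization, not from the continuous dynamics. The example only shows that this particular gradient-flow law fails under momentum for this toy model; it does not reproduce the paper's characterization of all conservation laws or its time-dependent momentum laws.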

