LGDSPRJul 29, 2024

Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning

arXiv:2407.20209v35 citationsh-index: 3
Originality Incremental advance
AI Analysis

This addresses the challenge of generalization in modern machine learning by analyzing optimization dynamics, though it is incremental as it builds on existing stability concepts.

The paper tackles the problem of understanding which global minima stochastic gradient descent (SGD) converges to in overparameterized learning, by characterizing dynamically stable and unstable minima and proving that a Lyapunov exponent determines SGD accumulation.

For overparameterized optimization tasks, such as those found in modern machine learning, global minima are generally not unique. In order to understand generalization in these settings, it is vital to study to which minimum an optimization algorithm converges. The possibility of having minima that are unstable under the dynamics imposed by the optimization algorithm limits the potential minima that the algorithm can find. In this paper, we characterize the global minima that are dynamically stable/unstable for both deterministic and stochastic gradient descent (SGD). In particular, we introduce a characteristic Lyapunov exponent that depends on the local dynamics around a global minimum and rigorously prove that the sign of this Lyapunov exponent determines whether SGD can accumulate at the respective global minimum.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes