A non-autonomous center-stable set theorem for saddle avoidance in optimization

arXiv:2603.0278211.4h-index: 3

Predicted impact top 73% in OC · last 90 daysOriginality Incremental advance

AI Analysis

Provides a theoretical foundation for saddle avoidance in optimization algorithms with non-constant step sizes, addressing a gap in existing dynamical systems analysis.

The authors prove a non-autonomous Center-Stable Set Theorem to show that gradient descent (Euclidean and Riemannian) and the proximal point method avoid strict saddle points even with vanishing step sizes, without requiring Lipschitz gradients or isolated saddles.

Optimization algorithms are unlikely to converge to strict saddle points. Proofs to that effect rely on the Center-Stable Manifold Theorem (CSMT), casting algorithms as dynamical systems: $x_{k+1} = g_k(x_k)$. In its standard form, the CSMT is limited to autonomous systems (the maps $g_k$ are all the same). To study algorithms such as gradient descent with non-constant step-size schedules, we need a non-autonomous CSMT. There are a few, but they are unable to handle, for example, vanishing step sizes. To cover such scenarios, we establish a new Center-Stable Set Theorem (CSST) for non-autonomous systems. We use it to prove saddle avoidance for gradient descent (Euclidean and Riemannian) and for the proximal point method, without assuming Lipschitz gradients or isolated saddles, and allowing vanishing step sizes.

View on arXiv PDF

Similar