Saddle Point Evasion via Curvature-Regularized Gradient Dynamics
This work addresses a fundamental challenge in machine learning and control, relevant to researchers and practitioners working on high-dimensional nonconvex optimization, though the contribution appears incremental because it builds on existing gradient-based methods.
The paper tackles the problem of escaping saddle points in nonconvex optimization by introducing Curvature-Regularized Gradient Dynamics (CRGD), a deterministic method with user-selectable convergence rates to second-order stationary points. Numerical experiments show that CRGD escapes saddle points across all tested configurations, with escape time decreasing with the eigenvalue gap.
Nonconvex optimization underlies many modern machine learning and control tasks, where saddle points pose the dominant obstacle to reliable convergence in high-dimensional settings. Escaping these saddle points deterministically and at a controllable rate remains an open challenge: gradient descent is blind to curvature, stochastic perturbation methods lack deterministic guarantees, and Newton-type approaches suffer from Hessian singularity. We present Curvature-Regularized Gradient Dynamics (CRGD), which augments the objective with a smooth penalty on the most negative Hessian eigenvalue, yielding an augmented cost that serves as an optimization Lyapunov function with user-selectable convergence rates to second-order stationary points. Numerical experiments on a nonconvex matrix factorization example confirm that CRGD escapes saddle points across all tested configurations, with escape time that decreases with the eigenvalue gap, in contrast to gradient descent, whose escape time grows inversely with the gap.
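The gradient-descent baseline behavior mentioned above can be illustrated with a minimal sketch. It is not the paper's experiment: the toy quadratic saddle, the step size, the starting offset, and the escape radius are all assumptions chosen for illustration, and "gap" is assumed here to mean the magnitude of the negative Hessian eigenvalue at the saddle, which the text does not define. Under those assumptions, the unstable coordinate grows by a factor of (1 + step_size * gap) per step, so the step count to escape scales roughly as 1/gap.

```python
import math

STEP_SIZE = 0.01   # assumed step size (hypothetical)
OFFSET = 1e-6      # assumed initial displacement along the unstable direction
RADIUS = 1.0       # assumed threshold on |y| that counts as "escaped"


def gd_escape_steps(gap, max_steps=10_000_000):
    """Count plain gradient-descent steps until |y| reaches RADIUS.

    Toy objective (an assumption, not the paper's matrix factorization example):
        f(x, y) = 0.5 * x**2 - 0.5 * gap * y**2
    It has a strict saddle at the origin with Hessian eigenvalues 1 and -gap.
    """
    x, y = 0.5, OFFSET  # start near the saddle, slightly off the stable axis
    for k in range(max_steps):
        if abs(y) >= RADIUS:
            return k
        # The gradient of f is (x, -gap * y); take one descent step.
        x -= STEP_SIZE * x
        y += STEP_SIZE * gap * y
    return max_steps


def predicted_steps(gap):
    """Closed-form step count from y_k = OFFSET * (1 + STEP_SIZE * gap)**k."""
    return math.log(RADIUS / OFFSET) / math.log(1.0 + STEP_SIZE * gap)


gaps = [1.0, 0.5, 0.1, 0.05]
steps = {g: gd_escape_steps(g) for g in gaps}

for g in gaps:
    print(f"gap={g:<5} steps={steps[g]:<7} predicted={predicted_steps(g):.1f}")
```

In this toy setting, shrinking the gap by a factor of ten increases the escape step count by roughly the same factor. That inverse scaling is the baseline behavior the abstract contrasts CRGD against.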