OCLGMar 6, 2025

Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity

arXiv:2503.04712v23 citationsh-index: 2
Originality Highly original
AI Analysis

This addresses a foundational bottleneck in machine learning optimization theory by extending convergence results to more realistic, non-smooth settings, though it is incremental in building on prior work on generalized smoothness.

The paper tackles the problem of optimizing non-convex functions without smoothness assumptions using first-order methods, establishing the first convergence guarantees for reaching second-order stationary points under generalized smoothness conditions.

We study the optimization of non-convex functions that are not necessarily smooth (gradient and/or Hessian are Lipschitz) using first order methods. Smoothness is a restrictive assumption in machine learning in both theory and practice, motivating significant recent work on finding first order stationary points of functions satisfying generalizations of smoothness with first order methods. We develop a novel framework that lets us systematically study the convergence of a large class of first-order optimization algorithms (which we call decrease procedures) under generalizations of smoothness. We instantiate our framework to analyze the convergence of first order optimization algorithms to first and \textit{second} order stationary points under generalizations of smoothness. As a consequence, we establish the first convergence guarantees for first order methods to second order stationary points under generalizations of smoothness. We demonstrate that several canonical examples fall under our framework, and highlight practical implications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes