LGOCMLSep 6, 2018

Escaping Saddle Points in Constrained Optimization

arXiv:1809.02162v261 citations
Originality Incremental advance
AI Analysis

This addresses a key challenge in nonconvex optimization for machine learning and AI, though it is incremental as it builds on existing saddle-point escape methods by extending them to constrained settings.

The paper tackles the problem of escaping saddle points in smooth nonconvex constrained optimization by proposing a framework that converges to a second-order stationary point, achieving this in at most O(max{ε^{-2}, ρ^{-3}γ^{-3}}) iterations under certain conditions.

In this paper, we study the problem of escaping from saddle points in smooth nonconvex optimization problems subject to a convex set $\mathcal{C}$. We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set $\mathcal{C}$ is simple for a quadratic objective function. Specifically, our results hold if one can find a $ρ$-approximate solution of a quadratic program subject to $\mathcal{C}$ in polynomial time, where $ρ<1$ is a positive constant that depends on the structure of the set $\mathcal{C}$. Under this condition, we show that the sequence of iterates generated by the proposed framework reaches an $(ε,γ)$-second order stationary point (SOSP) in at most $\mathcal{O}(\max\{ε^{-2},ρ^{-3}γ^{-3}\})$ iterations. We further characterize the overall complexity of reaching an SOSP when the convex set $\mathcal{C}$ can be written as a set of quadratic constraints and the objective function Hessian has a specific structure over the convex set $\mathcal{C}$. Finally, we extend our results to the stochastic setting and characterize the number of stochastic gradient and Hessian evaluations to reach an $(ε,γ)$-SOSP.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes