LG AISep 5, 2017

A Generic Approach for Escaping Saddle points

Sashank J Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J Smola

arXiv:1709.01434v119.587 citations

Originality Incremental advance

AI Analysis

This addresses a central challenge for practitioners using first-order methods in large-scale nonconvex optimization, though it is incremental as it builds on existing second-order methods.

The paper tackles the problem of saddle points in nonconvex optimization by introducing a generic framework that minimizes expensive Hessian computations while provably converging to second-order critical points, achieving competitive convergence results with empirical performance.

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.

View on arXiv PDF

Similar