Inertial Newton Algorithms Avoiding Strict Saddle Points
This work addresses non-convex optimization as encountered by machine learning practitioners, offering insight into how algorithms behave near critical points. The contribution appears incremental, since it builds on existing methods.
The paper asks whether second-order algorithms that combine Newton's method with inertial gradient descent escape strict saddle points in non-convex optimization. It shows that these methods almost always avoid such points, and supports the claim with both theory and numerical experiments.
We study the asymptotic behavior of second-order algorithms mixing Newton's method and inertial gradient descent in non-convex landscapes. We show that, despite the Newtonian behavior of these methods, they almost always escape strict saddle points. We also highlight the role played by the hyper-parameters of these methods in their qualitative behavior near critical points. The theoretical results are supported by numerical illustrations.
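As a rough illustration of the "almost always" statement, the sketch below runs an inertial Newton-type iteration on a toy function with a strict saddle. Everything in it is an assumption, since the abstract does not specify the scheme: the update rule is a first-order discretization of inertial Newton dynamics in the form found in the literature, the names `inna` and `grad_J` are hypothetical, and the objective J(x, y) = x²/2 + y⁴/4 - y²/2, the hyper-parameters `alpha`, `beta`, and the step size `gamma` were chosen only for this example.

```python
import numpy as np

def grad_J(theta):
    # Gradient of the toy objective J(x, y) = x^2/2 + y^4/4 - y^2/2 (assumed example).
    # (0, 0) is a strict saddle (Hessian diag(1, -1)); (0, +1) and (0, -1) are minima.
    x, y = theta
    return np.array([x, y**3 - y])

def inna(theta0, alpha=0.5, beta=0.1, gamma=0.1, n_steps=5000):
    # Assumed inertial Newton-type scheme with an auxiliary variable psi.
    # alpha (damping), beta (Newton-like term) and gamma (step size) are illustrative.
    theta = np.asarray(theta0, dtype=float)
    psi = (1.0 - alpha * beta) * theta  # initialization chosen for this sketch
    for _ in range(n_steps):
        # Both updates use the current (theta, psi) pair.
        drift = (1.0 / beta - alpha) * theta - psi / beta
        psi = psi + gamma * drift
        theta = theta + gamma * (drift - beta * grad_J(theta))
    return theta

# Generic start: a tiny component along the negative-curvature direction y.
generic = inna([0.5, 1e-3])

# Non-generic start: exactly on the x-axis, which is invariant for this toy problem.
on_manifold = inna([0.5, 0.0])

print("generic start     ->", generic)
print("start with y = 0  ->", on_manifold)
```

Under these assumed settings, the generic start should leave the neighborhood of the saddle and settle near one of the two minima, while the start with y = 0 stays on the x-axis and approaches the saddle. That axis is a set of measure zero, which is the sense in which escape holds for almost every initialization in this toy setting. Varying `alpha` and `beta` changes how quickly the iterates leave the saddle and whether they oscillate around the minimum.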