OCLGSep 5, 2023

First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians

arXiv:2309.02412v118 citationsh-index: 14
AI Analysis

This work addresses optimization problems for researchers and practitioners in machine learning and AI, offering incremental improvements in computational efficiency for non-convex settings.

The authors tackled non-convex optimization by developing first-order and zero-order implementations of the Cubically regularized Newton method, achieving improved complexity bounds of O(n^{1/2} ε^{-3/2}) for Hessian-free and O(n^{3/2} ε^{-3/2}) for derivative-free methods.

In this work, we develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for solving general non-convex optimization problems. For that, we employ finite difference approximations of the derivatives. We use a special adaptive search procedure in our algorithms, which simultaneously fits both the regularization constant and the parameters of the finite difference approximations. It makes our schemes free from the need to know the actual Lipschitz constants. Additionally, we equip our algorithms with the lazy Hessian update that reuse a previously computed Hessian approximation matrix for several iterations. Specifically, we prove the global complexity bound of $\mathcal{O}( n^{1/2} ε^{-3/2})$ function and gradient evaluations for our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2} ε^{-3/2} )$ function evaluations for the derivative-free method, where $n$ is the dimension of the problem and $ε$ is the desired accuracy for the gradient norm. These complexity bounds significantly improve the previously known ones in terms of the joint dependence on $n$ and $ε$, for the first-order and zeroth-order non-convex optimization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes