OC LGJun 16, 2025

Gradient-Normalized Smoothness for Optimization with Approximate Hessians

Andrei Semenov, Martin Jaggi, Nikita Doikov

arXiv:2506.13710v17.12 citationsh-index: 12Has Code

Originality Highly original

AI Analysis

This work addresses the challenge of global convergence in optimization for researchers and practitioners, offering incremental improvements by extending existing methods to broader classes with automated rate guarantees.

The paper tackles the problem of achieving fast global convergence rates for optimization algorithms using approximate second-order information by introducing Gradient-Normalized Smoothness, which provides universal guarantees and recovers state-of-the-art rates for various function classes, including logistic regression and non-convex problems.

In this work, we develop new optimization algorithms that use approximate second-order information combined with the gradient regularization technique to achieve fast global convergence rates for both convex and non-convex objectives. The key innovation of our analysis is a novel notion called Gradient-Normalized Smoothness, which characterizes the maximum radius of a ball around the current point that yields a good relative approximation of the gradient field. Our theory establishes a natural intrinsic connection between Hessian approximation and the linearization of the gradient. Importantly, Gradient-Normalized Smoothness does not depend on the specific problem class of the objective functions, while effectively translating local information about the gradient field and Hessian approximation into the global behavior of the method. This new concept equips approximate second-order algorithms with universal global convergence guarantees, recovering state-of-the-art rates for functions with Hölder-continuous Hessians and third derivatives, quasi-self-concordant functions, as well as smooth classes in first-order optimization. These rates are achieved automatically and extend to broader classes, such as generalized self-concordant functions. We demonstrate direct applications of our results for global linear rates in logistic regression and softmax problems with approximate Hessians, as well as in non-convex optimization using Fisher and Gauss-Newton approximations.

View on arXiv PDF Code

Similar