OCLGAGCVDSSep 23, 2021

Generalisations and improvements of New Q-Newton's method Backtracking

arXiv:2109.11395v1
Originality Synthesis-oriented
AI Analysis

This work provides incremental improvements to optimization algorithms, potentially benefiting researchers in numerical methods and machine learning.

The paper generalizes New Q-Newton's method Backtracking by introducing a flexible framework that allows parameters like τ to be less than 1 and orthonormal bases not necessarily eigenvectors, extending its applicability to optimization problems.

In this paper, we propose a general framework for the algorithm New Q-Newton's method Backtracking, developed in the author's previous work. For a symmetric, square real matrix $A$, we define $minsp(A):=\min _{||e||=1} ||Ae||$. Given a $C^2$ cost function $f:\mathbb{R}^m\rightarrow \mathbb{R}$ and a real number $0<τ$, as well as $m+1$ fixed real numbers $δ_0,\ldots ,δ_m$, we define for each $x\in \mathbb{R}^m$ with $\nabla f(x)\not= 0$ the following quantities: $κ:=\min _{i\not= j}|δ_i-δ_j|$; $A(x):=\nabla ^2f(x)+δ||\nabla f(x)||^τId$, where $δ$ is the first element in the sequence $\{δ_0,\ldots ,δ_m\}$ for which $minsp(A(x))\geq κ||\nabla f(x)||^τ$; $e_1(x),\ldots ,e_m(x)$ are an orthonormal basis of $\mathbb{R}^m$, chosen appropriately; $w(x)=$ the step direction, given by the formula: $$w(x)=\sum _{i=1}^m\frac{<\nabla f(x),e_i(x)>}{||A(x)e_i(x)||}e_i(x);$$ (we can also normalise by $w(x)/\max \{1,||w(x)||\}$ when needed) $γ(x)>0$ learning rate chosen by Backtracking line search so that Armijo's condition is satisfied: $$f(x-γ(x)w(x))-f(x)\leq -\frac{1}{3}γ(x)<\nabla f(x),w(x)>.$$ The update rule for our algorithm is $x\mapsto H(x)=x-γ(x)w(x)$. In New Q-Newton's method Backtracking, the choices are $τ=1+α>1$ and $e_1(x),\ldots ,e_m(x)$'s are eigenvectors of $\nabla ^2f(x)$. In this paper, we allow more flexibility and generality, for example $τ$ can be chosen to be $<1$ or $e_1(x),\ldots ,e_m(x)$'s are not necessarily eigenvectors of $\nabla ^2f(x)$. New Q-Newton's method Backtracking (as well as Backtracking gradient descent) is a special case, and some versions have flavours of quasi-Newton's methods. Several versions allow good theoretical guarantees. An application to solving systems of polynomial equations is given.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes