OC · LG · NA · ML · Nov 18, 2019

Coordinate-wise Armijo's condition

arXiv:1911.07820v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work targets optimization efficiency for separable functions. It is incremental: it extends existing Backtracking Gradient Descent methods rather than introducing a new framework.

The paper tackles the problem of optimizing coordinate-wise sum functions by proposing a coordinate-wise variant of Armijo's condition, showing through an example that it offers advantages over the standard Armijo's condition.

Let $z=(x,y)$ be coordinates for the product space $\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$. Let $f:\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}\rightarrow \mathbb{R}$ be a $C^1$ function, and $\nabla f=(\partial _xf,\partial _yf)$ its gradient. Fix $0<\alpha<1$. For a point $(x,y) \in \mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$, a number $\delta>0$ satisfies Armijo's condition at $(x,y)$ if the following inequality holds, with all partial derivatives evaluated at $(x,y)$:
\begin{eqnarray*} f(x-\delta\partial _xf,y-\delta\partial _yf)-f(x,y)\leq -\alpha\delta(||\partial _xf||^2+||\partial _yf||^2). \end{eqnarray*}
When $f(x,y)=f_1(x)+f_2(y)$ is a coordinate-wise sum map, we propose the following coordinate-wise Armijo's condition. Fix again $0<\alpha<1$. A pair of positive numbers $\delta_1,\delta_2>0$ satisfies the coordinate-wise variant of Armijo's condition at $(x,y)$ if the following inequality holds:
\begin{eqnarray*} [f_1(x-\delta_1\nabla f_1(x))+f_2(y-\delta_2\nabla f_2(y))]-[f_1(x)+f_2(y)]\leq -\alpha(\delta_1||\nabla f_1(x)||^2+\delta_2||\nabla f_2(y)||^2). \end{eqnarray*}
We then extend our recent results on Backtracking Gradient Descent and some variants to this setting. We show by an example the advantage of using the coordinate-wise Armijo's condition over the usual Armijo's condition.
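To make the two conditions concrete, here is a minimal Python sketch. It is not taken from the paper: the helper names (`sq_norm`, `gd_step`, `armijo_holds`, `coordinatewise_armijo_holds`, `backtrack`), the shrink factor `beta`, the initial step `delta0`, and the toy function $f_1(x)=\frac{1}{2}x^2$, $f_2(y)=50y^2$ are all assumptions made for illustration. The sketch finds a coordinate-wise pair by backtracking on each block separately, which is sufficient because adding the two per-block Armijo inequalities gives the coordinate-wise inequality. The paper's own algorithm and example may differ.

```python
from typing import Callable, List

Vector = List[float]
Func = Callable[[Vector], float]


def sq_norm(v: Vector) -> float:
    """Squared Euclidean norm ||v||^2."""
    return sum(c * c for c in v)


def gd_step(point: Vector, grad: Vector, delta: float) -> Vector:
    """Return point - delta * grad."""
    return [p - delta * g for p, g in zip(point, grad)]


def armijo_holds(f: Func, point: Vector, grad: Vector,
                 delta: float, alpha: float) -> bool:
    """Usual Armijo: f(z - delta*grad) - f(z) <= -alpha*delta*||grad||^2."""
    decrease = f(gd_step(point, grad, delta)) - f(point)
    return decrease <= -alpha * delta * sq_norm(grad)


def coordinatewise_armijo_holds(f1: Func, f2: Func, x: Vector, y: Vector,
                                g1: Vector, g2: Vector,
                                d1: float, d2: float, alpha: float) -> bool:
    """Coordinate-wise Armijo for f(x, y) = f1(x) + f2(y) with steps d1, d2."""
    lhs = (f1(gd_step(x, g1, d1)) + f2(gd_step(y, g2, d2))) - (f1(x) + f2(y))
    rhs = -alpha * (d1 * sq_norm(g1) + d2 * sq_norm(g2))
    return lhs <= rhs


def backtrack(f: Func, point: Vector, grad: Vector, alpha: float = 0.5,
              beta: float = 0.5, delta0: float = 1.0,
              max_shrinks: int = 60) -> float:
    """Shrink delta by beta (assumed rule) until the usual Armijo test passes."""
    delta = delta0
    for _ in range(max_shrinks):
        if armijo_holds(f, point, grad, delta, alpha):
            return delta
        delta *= beta
    raise RuntimeError("no admissible step size found")


# Hypothetical toy problem: a flat block x and a steep block y.
f1 = lambda x: 0.5 * x[0] ** 2          # gradient: x
f2 = lambda y: 50.0 * y[0] ** 2         # gradient: 100 * y
f = lambda z: f1(z[:1]) + f2(z[1:])     # f(x, y) = f1(x) + f2(y)

x, y = [1.0], [1.0]
g1, g2 = [1.0], [100.0]                 # gradients of f1, f2 at x, y

d_joint = backtrack(f, x + y, g1 + g2)  # one step size for both blocks
d1 = backtrack(f1, x, g1)               # per-block step sizes
d2 = backtrack(f2, y, g2)
print(d_joint, d1, d2)                  # 0.0078125 1.0 0.0078125
```

In this toy case the steep block forces the single step size of the usual condition down to $2^{-7}$, whereas the coordinate-wise condition lets the flat block keep $\delta_1=1$ while the steep block uses $\delta_2=2^{-7}$.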
