OCLGNACOMLJul 1, 2016

Randomized block proximal damped Newton method for composite self-concordant minimization

arXiv:1607.00101v1
Originality Incremental advance
AI Analysis

This work addresses scalability issues in optimization methods for machine learning applications, offering an incremental improvement with new convergence guarantees.

The paper tackles the high computational cost per iteration of proximal damped Newton methods for composite self-concordant minimization by proposing a randomized block proximal damped Newton method, which reduces cost and demonstrates promise in experiments on large-scale regularized logistic regression problems.

In this paper we consider the composite self-concordant (CSC) minimization problem, which minimizes the sum of a self-concordant function $f$ and a (possibly nonsmooth) proper closed convex function $g$. The CSC minimization is the cornerstone of the path-following interior point methods for solving a broad class of convex optimization problems. It has also found numerous applications in machine learning. The proximal damped Newton (PDN) methods have been well studied in the literature for solving this problem that enjoy a nice iteration complexity. Given that at each iteration these methods typically require evaluating or accessing the Hessian of $f$ and also need to solve a proximal Newton subproblem, the cost per iteration can be prohibitively high when applied to large-scale problems. Inspired by the recent success of block coordinate descent methods, we propose a randomized block proximal damped Newton (RBPDN) method for solving the CSC minimization. Compared to the PDN methods, the computational cost per iteration of RBPDN is usually significantly lower. The computational experiment on a class of regularized logistic regression problems demonstrate that RBPDN is indeed promising in solving large-scale CSC minimization problems. The convergence of RBPDN is also analyzed in the paper. In particular, we show that RBPDN is globally convergent when $g$ is Lipschitz continuous. It is also shown that RBPDN enjoys a local linear convergence. Moreover, we show that for a class of $g$ including the case where $g$ is Lipschitz differentiable, RBPDN enjoys a global linear convergence. As a striking consequence, it shows that the classical damped Newton methods [22,40] and the PDN [31] for such $g$ are globally linearly convergent, which was previously unknown in the literature. Moreover, this result can be used to sharpen the existing iteration complexity of these methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes