OCLGFeb 10, 2020

SPAN: A Stochastic Projected Approximate Newton Method

arXiv:2002.03687v22 citations
AI Analysis

This work addresses the computational bottleneck in Newton methods for optimization, offering a faster alternative for machine learning practitioners, though it appears incremental as it builds on approximate Newton techniques.

The paper tackles the problem of expensive Hessian computation in second-order optimization by proposing SPAN, a stochastic projected approximate Newton method that uses low-rank approximation and stochastic Hessian-vector products, resulting in outperforming existing methods in convergence wall-clock time on benchmark datasets.

Second-order optimization methods have desirable convergence properties. However, the exact Newton method requires expensive computation for the Hessian and its inverse. In this paper, we propose SPAN, a novel approximate and fast Newton method. SPAN computes the inverse of the Hessian matrix via low-rank approximation and stochastic Hessian-vector products. Our experiments on multiple benchmark datasets demonstrate that SPAN outperforms existing first-order and second-order optimization methods in terms of the convergence wall-clock time. Furthermore, we provide a theoretical analysis of the per-iteration complexity, the approximation error, and the convergence rate. Both the theoretical analysis and experimental results show that our proposed method achieves a better trade-off between the convergence rate and the per-iteration efficiency.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes