OCLGMLNov 15, 2016

Oracle Complexity of Second-Order Methods for Finite-Sum Problems

arXiv:1611.04982v225 citations
Originality Synthesis-oriented
AI Analysis

This addresses a fundamental problem in machine learning optimization for researchers and practitioners, showing that second-order methods may not offer efficiency gains as previously hoped, which is incremental but clarifies theoretical limits.

The paper tackles the question of whether second-order methods can solve finite-sum optimization problems more efficiently than first-order methods, finding that, in terms of worst-case guarantees, the answer is negative, though it suggests potential workarounds with additional assumptions.

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in \emph{second-order} methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer -- perhaps surprisingly -- is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes