LG, OC, ML | Jul 14, 2018

On the Acceleration of L-BFGS with Second-Order Information and Stochastic Batches

arXiv:1807.05328v1 | 3 citations
Originality: Incremental advance
AI Analysis

The paper offers an incremental improvement in optimization efficiency for machine learning practitioners solving large-scale finite-sum problems, such as those with least-squares and cross-entropy losses.

It tackles the instability of L-BFGS under stochastic batches in finite-sum minimization by using a smooth estimate of the gradient differences and well-scaled initial Hessians. The authors report acceleration, backed by theoretical analysis for the convex and nonconvex cases and by numerical experiments.

This paper proposes an L-BFGS framework based on (approximate) second-order information with stochastic batches, as a novel approach to finite-sum minimization problems. Unlike classical L-BFGS, where stochastic batches lead to instability, we use a smooth estimate of the gradient differences and achieve acceleration by scaling the initial Hessians well. We provide theoretical analyses for both the convex and nonconvex cases. In addition, we demonstrate that for the popular applications of least-squares and cross-entropy losses, the algorithm admits a simple implementation in a distributed environment. Numerical experiments support the efficiency of our algorithms.
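To make the two ingredients named in the abstract concrete, here is a minimal sketch for a least-squares loss. It is not the paper's algorithm: the function names, the fixed step size, and the batch sampling are hypothetical. It assumes that the "smooth estimate" of the gradient difference can be illustrated by a batch Hessian-vector product, y = (A_S^T A_S / |S|) s, and it uses the textbook scaling s^T y / y^T y for the initial inverse Hessian in the standard two-loop recursion.

```python
import numpy as np

def two_loop(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: approximates (inverse Hessian) @ grad."""
    q = np.array(grad, dtype=float)
    alphas = []
    # First loop: newest curvature pair to oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    # Scaled initial inverse Hessian (textbook choice, assumed here).
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    # Second loop: oldest curvature pair to newest.
    for s, y, a in zip(s_list, y_list, reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def stochastic_lbfgs_least_squares(A, b, batch_size=100, memory=5,
                                   step=0.5, iters=200, seed=0):
    """Hypothetical sketch: minimizes the mean of 0.5 * (a_i @ x - b_i)**2 over rows."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    s_list, y_list = [], []
    for _ in range(iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        A_S, b_S = A[idx], b[idx]
        grad = A_S.T @ (A_S @ x - b_S) / batch_size
        s = -step * two_loop(grad, s_list, y_list)
        x = x + s
        # Assumption: gradient difference estimated by a batch Hessian-vector
        # product instead of subtracting two noisy batch gradients.
        y = A_S.T @ (A_S @ s) / batch_size
        if y @ s > 1e-12:  # keep only pairs with positive curvature
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)
    return x
```

On a noiseless synthetic problem (for example `A` drawn from a standard normal with shape `(400, 5)` and `b = A @ x_true`), calling `stochastic_lbfgs_least_squares(A, b)` should return a vector close to `x_true`. Because the Hessian-vector product uses the same batch as the step, `y @ s` is non-negative for least squares, which is one reason this kind of estimate behaves more smoothly than differencing gradients from different batches.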
