LG OC ML

Jul 14, 2018

On the Acceleration of L-BFGS with Second-Order Information and Stochastic Batches

Jie Liu, Yu Rong, Martin Takac, Junzhou Huang

arXiv:1807.05328v1

2.93 citations

Originality: Incremental advance

AI Analysis

An incremental advance in optimization efficiency, aimed at machine learning practitioners who solve large-scale finite-sum problems such as least-squares and cross-entropy loss minimization.

The paper tackles the instability of L-BFGS under stochastic batches in finite-sum minimization. It replaces raw gradient differences with a smooth estimate and uses well-scaled initial Hessians; the authors report that this yields acceleration in their numerical experiments.

This paper proposes an L-BFGS framework based on (approximate) second-order information with stochastic batches, as a novel approach to finite-sum minimization problems. Unlike classical L-BFGS, where stochastic batches lead to instability, we use a smooth estimate for the evaluations of the gradient differences while achieving acceleration by well-scaling the initial Hessians. We provide theoretical analyses for both the convex and nonconvex cases. In addition, we demonstrate that for the popular applications of least-squares and cross-entropy losses, the algorithm admits a simple implementation in a distributed environment. Numerical experiments support the efficiency of our algorithms.
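To make the ingredients named in the abstract concrete, here is a minimal sketch of a stochastic L-BFGS loop on a least-squares problem. It is not the authors' algorithm: the abstract does not specify the smooth gradient-difference estimate or the Hessian scaling, so the sketch assumes (a) the curvature vector is formed as the batch Hessian times the step, y = (A_b^T A_b / |b|) s, rather than as a difference of noisy gradients, and (b) the initial inverse Hessian is the standard scaling (s^T y / y^T y) I. The function names, step size, batch size, and memory length are all hypothetical choices for illustration.

```python
import numpy as np

def two_loop(grad, s_list, y_list):
    """Classical L-BFGS two-loop recursion: returns H @ grad, where H is the
    inverse-Hessian approximation built from the stored (s, y) pairs."""
    q = grad.copy()
    alphas = []
    # First loop: newest pair to oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    # Scaled initial inverse Hessian gamma * I (standard choice, assumed here).
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    # Second loop: oldest pair to newest.
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def stochastic_lbfgs_least_squares(A, b, iters=200, batch=64, memory=5,
                                   lr=0.5, seed=0):
    """Minimize (1/2n) * ||A w - b||^2 with mini-batch L-BFGS steps."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    s_list, y_list = [], []
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)
        Ab, bb = A[idx], b[idx]
        grad = Ab.T @ (Ab @ w - bb) / batch
        step = -lr * two_loop(grad, s_list, y_list)
        # Assumption: curvature from a batch Hessian-vector product,
        # y = (Ab^T Ab / batch) @ step, instead of a gradient difference.
        y = Ab.T @ (Ab @ step) / batch
        # Keep the pair only if it carries positive curvature.
        if step @ y > 1e-8 * (step @ step):
            s_list.append(step)
            y_list.append(y)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)
        w = w + step
    return w
```

A Hessian-vector product gives s^T y > 0 whenever the batch Hessian is positive definite, which is one common way to avoid the instability that differences of gradients from different batches can cause. For cross-entropy losses the same loop would need a different gradient and Hessian-vector product; that case is not sketched here.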
