OC LG MLJan 28, 2019

Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample

Albert S. Berahas, Majid Jahani, Peter Richtárik, Martin Takáč

arXiv:1901.09997v523.051 citationsh-index: 47Has Code

Originality Incremental advance

AI Analysis

This work addresses optimization efficiency for machine learning practitioners by offering incremental improvements to quasi-Newton methods with better concurrency and data usage.

The paper tackles the problem of stale information in classical quasi-Newton methods for machine learning by proposing sampled variants (LBFGS and LSR1) that use random local sampling instead of sequential past data, resulting in improved performance on classification and neural network tasks.

We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning. Contrary to the classical variants of these methods that sequentially build Hessian or inverse Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to produce these approximations. As a result, the approximations constructed make use of more reliable (recent and local) information, and do not depend on past iterate information that could be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of parallel/distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem as well as on popular benchmarking binary classification and neural network training tasks reveal that the methods outperform their classical variants.

View on arXiv PDF Code

Similar