Mar 21, 2020

Approximate Newton Methods

arXiv:1702.08124, 15 citations, h-index: 25

AI Analysis

For researchers and practitioners using second-order optimization methods in large-scale machine learning, this work provides a theoretical foundation that better explains observed performance.

This paper fills gaps between convergence theory and practical performance of subsampled Newton methods by proposing a unifying framework that analyzes both local and global convergence, yielding theoretical results that match real-world application performance.

Many machine learning models involve solving optimization problems, so handling large-scale optimization is important in big data applications. Recently, subsampled Newton methods have attracted much attention because each iteration is cheap: they rectify a weakness of the ordinary Newton method, which enjoys a high convergence rate but suffers a high cost per iteration. Other efficient stochastic second-order methods have also been proposed. However, the convergence properties of these methods are still not well understood, and there are several important gaps between the current convergence theory and the performance in real applications. In this paper, we aim to fill these gaps. We propose a unifying framework to analyze both local and global convergence properties of second-order methods. Based on this framework, we present theoretical results that match the performance in real applications well.
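
To make the idea of a subsampled Newton step concrete, here is a minimal sketch in Python. It is not the paper's algorithm or framework. It assumes an L2-regularized logistic regression objective, uniform row subsampling for the Hessian, a full gradient, and a unit step size. The names `logistic_grad`, `subsampled_newton`, `lam`, and `sample_size` are hypothetical and chosen only for illustration.

```python
import numpy as np


def logistic_grad(w, X, y, lam):
    """Full gradient of the L2-regularized logistic loss (labels y in {0, 1})."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / X.shape[0] + lam * w


def subsampled_newton(X, y, lam=1e-2, sample_size=200, iters=20, seed=0):
    """Newton-type iterations that use a Hessian built from a row subsample."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        grad = logistic_grad(w, X, y, lam)
        # Assumption: uniform sampling without replacement of sample_size rows.
        idx = rng.choice(n, size=min(sample_size, n), replace=False)
        Xs = X[idx]
        ps = 1.0 / (1.0 + np.exp(-(Xs @ w)))
        # Approximate Hessian: (1/s) * Xs^T diag(p(1-p)) Xs + lam * I
        H = (Xs * (ps * (1.0 - ps))[:, None]).T @ Xs / len(idx) + lam * np.eye(d)
        # Solve H * step = grad instead of forming the inverse explicitly.
        w = w - np.linalg.solve(H, grad)
    return w
```

Each iteration forms the Hessian from only `sample_size` rows rather than all `n`, which is the per-iteration saving the abstract refers to. The price is an approximate Hessian, so the iteration no longer has the exact Newton method's convergence behavior.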

