ML · Aug 12, 2015

Convergence rates of sub-sampled Newton methods

arXiv:1508.02810v2 · 167 citations
AI Analysis

The paper provides an efficient optimization method for large-scale machine learning problems, though the contribution is incremental, since it builds on existing sub-sampling techniques.

The paper tackles the problem of minimizing a sum of many functions in high-dimensional settings. It proposes a new randomized batch algorithm that combines sub-sampling with low-rank approximation and attains a composite convergence rate (quadratic at the start, then linear near the minimizer) comparable to that of Newton's method, at a much smaller per-iteration cost.

We consider the problem of minimizing a sum of $n$ functions over a convex parameter set $\mathcal{C} \subset \mathbb{R}^p$ where $n\gg p\gg 1$. In this regime, algorithms which utilize sub-sampling techniques are known to be effective. In this paper, we use sub-sampling techniques together with low-rank approximation to design a new randomized batch algorithm which possesses comparable convergence rate to Newton's method, yet has much smaller per-iteration cost. The proposed algorithm is robust in terms of starting point and step size, and enjoys a composite convergence rate, namely, quadratic convergence at start and linear convergence when the iterate is close to the minimizer. We develop its theoretical analysis which also allows us to select near-optimal algorithm parameters. Our theoretical results can be used to obtain convergence rates of previously proposed sub-sampling based algorithms as well. We demonstrate how our results apply to well-known machine learning problems. Lastly, we evaluate the performance of our algorithm on several datasets under various scenarios.
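To make the idea in the abstract concrete, here is a minimal sketch of one way to combine a sub-sampled Hessian with a low-rank approximation. It is an illustration under stated assumptions, not the paper's reference implementation. The objective (L2-regularized logistic regression on synthetic data), all function names, the choice of a truncated eigendecomposition whose remaining eigenvalues are replaced by the (rank+1)-th one, and every parameter value are assumptions made for this example. The sketch also covers only the unconstrained case, so there is no projection onto $\mathcal{C}$.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(theta, X, y, lam):
    # Average logistic loss plus an L2 penalty (assumed objective).
    z = X @ theta
    return np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * theta @ theta


def gradient(theta, X, y, lam):
    # Full gradient over all n functions.
    return X.T @ (sigmoid(X @ theta) - y) / X.shape[0] + lam * theta


def subsampled_hessian(theta, X, lam, idx):
    # Hessian averaged over the sampled rows only.
    Xs = X[idx]
    s = sigmoid(Xs @ theta)
    w = s * (1.0 - s)
    return (Xs * w[:, None]).T @ Xs / len(idx) + lam * np.eye(X.shape[1])


def low_rank_newton_direction(H_s, g, rank):
    # Keep the top `rank` eigenpairs of H_s and treat every remaining
    # eigenvalue as equal to the (rank+1)-th largest one (assumed scheme).
    # A full eigh is used for brevity; a truncated solver would be cheaper.
    vals, vecs = np.linalg.eigh(H_s)          # ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]    # switch to descending order
    top, U, tail = vals[:rank], vecs[:, :rank], vals[rank]
    # Apply Q = (1/tail) I + U (diag(1/top) - (1/tail) I) U^T to g.
    return g / tail + U @ ((1.0 / top - 1.0 / tail) * (U.T @ g))


def run_demo(n=5000, p=20, rank=5, sample_size=500, iters=30,
             step_size=1.0, lam=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic data: `rank` high-variance directions, the rest unit variance.
    scales = np.ones(p)
    scales[:rank] = 3.0
    X = rng.normal(size=(n, p)) * scales
    theta_true = 0.3 * rng.normal(size=p)
    y = (rng.random(n) < sigmoid(X @ theta_true)).astype(float)

    theta = np.zeros(p)
    start = (loss(theta, X, y, lam),
             np.linalg.norm(gradient(theta, X, y, lam)))
    for _ in range(iters):
        idx = rng.choice(n, size=sample_size, replace=False)
        H_s = subsampled_hessian(theta, X, lam, idx)
        g = gradient(theta, X, y, lam)
        theta = theta - step_size * low_rank_newton_direction(H_s, g, rank)
    end = (loss(theta, X, y, lam),
           np.linalg.norm(gradient(theta, X, y, lam)))
    return start, end
```

Calling `run_demo()` returns the loss and gradient norm before and after the iterations, which gives a quick check that the approximate Newton steps make progress. In this sketch each iteration builds the Hessian from `sample_size` rows rather than all `n`, which is where the per-iteration saving over a full Newton step would come from.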

