LGMLJan 31, 2016

A Proximal Stochastic Quasi-Newton Algorithm

arXiv:1602.00223v26 citations
AI Analysis

This addresses optimization challenges in large-scale machine learning, but it appears incremental as it builds on existing proximal and stochastic methods.

The paper tackles the problem of minimizing a sum of smooth and non-smooth convex functions, where the smooth part is an average of many components, by proposing a proximal stochastic second-order method that achieves linear convergence.

In this paper, we discuss the problem of minimizing the sum of two convex functions: a smooth function plus a non-smooth function. Further, the smooth part can be expressed by the average of a large number of smooth component functions, and the non-smooth part is equipped with a simple proximal mapping. We propose a proximal stochastic second-order method, which is efficient and scalable. It incorporates the Hessian in the smooth part of the function and exploits multistage scheme to reduce the variance of the stochastic gradient. We prove that our method can achieve linear rate of convergence.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes