OCDSLGMLSep 12, 2016

Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method

arXiv:1609.03261v3103 citations
AI Analysis

This work addresses efficiency challenges in machine learning optimization for practitioners dealing with large datasets, though it appears incremental as it builds on the SVRG family.

The paper tackles the problem of reducing computational and communication costs in gradient-based optimization by introducing the stochastically controlled stochastic gradient (SCSG) method, a member of the SVRG family, which uses gradient estimates at two scales controlled by a geometric random variable, achieving costs independent of sample size for low target accuracy.

We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales, with the number of updates at the faster scale being governed by a geometric random variable. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size $n$; indeed, these costs are independent of $n$ when the target accuracy is low. An experimental evaluation on real datasets confirms the effectiveness of SCSG.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes