Perturbed Iterate Analysis for Asynchronous Stochastic Optimization
This work addresses efficient parallel optimization for machine learning practitioners. Its unified analysis method is incremental, but it yields practical improvements.
The paper analyzes asynchronous stochastic optimization algorithms through a perturbed iterate framework, which simplifies earlier analyses, removes many of their assumptions, and in some cases yields improved upper bounds on convergence rates. In experiments on a 16-core machine, a sparse, parallel version of SVRG is in some cases more than four orders of magnitude faster than standard SVRG.
We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms. In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs. Using our perturbed iterate framework, we provide new analyses of the Hogwild! algorithm and asynchronous stochastic coordinate descent that are simpler than earlier analyses, remove many assumptions of previous models, and in some cases yield improved upper bounds on the convergence rates. We then apply our framework to develop and analyze KroMagnon: a novel, parallel, sparse stochastic variance-reduced gradient (SVRG) algorithm. We demonstrate experimentally on a 16-core machine that the sparse and parallel version of SVRG is in some cases more than four orders of magnitude faster than the standard SVRG algorithm.
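The idea of a serial method operating on noisy inputs can be illustrated with a minimal sketch. It is not the paper's Hogwild! or KroMagnon implementation. It is plain serial SGD on a toy least-squares problem, where each stochastic gradient is evaluated at a perturbed copy of the iterate and the update is then applied to the true iterate. The function name `perturbed_sgd`, the uniform noise model, the parameters `step` and `noise_bound`, and the toy data are all assumptions made for illustration.

```python
import random

def perturbed_sgd(A, b, step=0.05, noise_bound=0.01, iters=5000, seed=0):
    """Serial SGD on least squares with gradients taken at a perturbed iterate.

    Hypothetical illustration: the bounded noise stands in for the stale or
    inconsistent reads that an asynchronous implementation would see.
    """
    rng = random.Random(seed)
    d = len(A[0])
    x = [0.0] * d
    for _ in range(iters):
        i = rng.randrange(len(A))  # sample one data point uniformly
        # Perturbed input: each coordinate is off by at most noise_bound.
        x_hat = [xk + rng.uniform(-noise_bound, noise_bound) for xk in x]
        residual = sum(a * xh for a, xh in zip(A[i], x_hat)) - b[i]
        # The gradient is computed at x_hat, but the step is applied to x.
        x = [xk - step * residual * a for xk, a in zip(x, A[i])]
    return x

# Toy consistent linear system (assumed data, not from the paper).
x_true = [1.0, -2.0, 0.5]
data_rng = random.Random(1)
A = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(50)]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]

x_est = perturbed_sgd(A, b)
err = max(abs(e - t) for e, t in zip(x_est, x_true))
```

Setting `noise_bound=0.0` recovers ordinary serial SGD, so comparing `err` across a few noise levels gives a rough feel for how bounded input perturbations affect the final accuracy in this toy setting.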