LG · ML · Nov 19, 2015

Stochastic modified equations and adaptive stochastic gradient algorithms

arXiv:1511.06251v3 · 328 citations
Originality: Highly original
AI Analysis

The paper provides a general methodology for analyzing and designing stochastic gradient algorithms, a foundation for optimization in machine learning.

The authors address the analysis and design of stochastic gradient algorithms by developing stochastic modified equations (SME), which approximate those algorithms with continuous-time stochastic differential equations. Combining this continuous formulation with optimal control theory, they derive adaptive hyper-parameter policies that perform competitively and remain robust across models and datasets.

We develop the method of stochastic modified equations (SME), in which stochastic gradient algorithms are approximated in the weak sense by continuous-time stochastic differential equations. We exploit the continuous formulation together with optimal control theory to derive novel adaptive hyper-parameter adjustment policies. Our algorithms have competitive performance with the added benefit of being robust to varying models and datasets. This provides a general methodology for the analysis and design of stochastic gradient algorithms.
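
To make the weak-approximation idea concrete, here is a minimal sketch in Python on a toy problem. Everything in it is an illustrative assumption rather than the authors' code: the objective is f(x) = x^2, written as the average of two sample objectives (x - 1)^2 - 1 and (x + 1)^2 - 1, and the SME is assumed to take the first-order form dX = -f'(X) dt + sqrt(eta * Sigma) dW, where eta is the learning rate and Sigma is the variance of the sample gradients (here Sigma = 4). The function names `sgd_paths` and `sme_paths` are hypothetical. The SDE is integrated with a plain Euler-Maruyama scheme.

```python
import numpy as np


def sgd_paths(x0, eta, n_steps, n_paths, rng):
    """Run n_paths independent SGD trajectories on the toy objective f(x) = x^2.

    Each step samples gamma in {-1, +1} and uses the sample gradient 2 * (x - gamma),
    whose mean is the full gradient 2x and whose variance is 4.
    """
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        gamma = rng.choice([-1.0, 1.0], size=n_paths)
        x = x - eta * 2.0 * (x - gamma)
    return x


def sme_paths(x0, eta, n_steps, n_paths, rng, substeps=10):
    """Euler-Maruyama simulation of the assumed first-order SME for the same problem:

        dX = -2 X dt + sqrt(eta * 4) dW,

    integrated up to time T = n_steps * eta (one SGD step corresponds to time eta).
    """
    dt = eta / substeps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps * substeps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increment
        x = x - 2.0 * x * dt + 2.0 * np.sqrt(eta) * dw
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eta, n_steps, n_paths = 0.05, 20, 20000
    x_sgd = sgd_paths(1.0, eta, n_steps, n_paths, rng)
    x_sme = sme_paths(1.0, eta, n_steps, n_paths, rng)
    # Weak approximation concerns expectations of test functions, not individual paths.
    print("E[x^2] under SGD:", np.mean(x_sgd ** 2))
    print("E[X^2] under SME:", np.mean(x_sme ** 2))
```

The comparison is made on expectations (here the second moment) because approximation "in the weak sense" refers to agreement of distributions rather than of individual trajectories. In this sketch the two estimates should come out close for small eta, with the gap shrinking as eta decreases.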

