ML, LG, May 21, 2018

Stochastic modified equations for the asynchronous stochastic gradient descent

arXiv:1805.08244v3, 11 citations
Originality: Incremental advance
AI Analysis

This work provides a theoretical framework for understanding and optimizing ASGD. The advance is incremental because it extends existing SME methods for stochastic gradient algorithms.

The authors model asynchronous stochastic gradient descent (ASGD) with a stochastic modified equation (SME) of Langevin type. The SME elucidates the dynamics of ASGD and its relationship to other stochastic gradient algorithms, and the authors apply it to derive an optimal mini-batching strategy for ASGD.

We propose a stochastic modified equation (SME) for modeling asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the relationship between different types of stochastic gradient algorithms. We show the convergence of ASGD to the SME in the continuous-time limit, as well as the SME's precise prediction of the trajectories of ASGD with various forcing terms. As an application of the SME, we propose an optimal mini-batching strategy for ASGD by solving the optimal control problem of the associated SME.
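To make the setting concrete, here is a minimal sketch of the kind of update ASGD performs: each step applies a noisy gradient evaluated at a stale iterate rather than the current one. This is not the paper's model or code. The quadratic objective, the fixed delay, the Gaussian gradient noise, and the function name `asgd_quadratic` are all assumptions chosen for illustration.

```python
import random
from collections import deque


def asgd_quadratic(x0=1.0, lr=0.05, delay=4, noise_std=0.1, steps=400, seed=0):
    """Toy ASGD on f(x) = 0.5 * x**2 with a fixed gradient delay.

    Assumed setup (not from the paper): the gradient used at step k is
    evaluated at the iterate from `delay` steps earlier, plus Gaussian noise.
    """
    rng = random.Random(seed)
    # Sliding window holding the last delay + 1 iterates; history[0] is the stale one.
    history = deque([x0] * (delay + 1), maxlen=delay + 1)
    x = x0
    trajectory = [x]
    for _ in range(steps):
        stale_x = history[0]                        # iterate read `delay` steps ago
        grad = stale_x + rng.gauss(0.0, noise_std)  # noisy gradient of 0.5 * x**2
        x = x - lr * grad                           # update applied to the current iterate
        history.append(x)                           # oldest iterate drops out of the window
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = asgd_quadratic()
    print("final iterate:", traj[-1])
```

Setting `delay=0` recovers plain SGD on the same objective, which gives a simple baseline for comparing trajectories with and without staleness.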

