LG · OC · ML · Sep 11, 2024

Convergence of continuous-time stochastic gradient descent with applications to deep neural networks

arXiv:2409.07401v23 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work advances the theoretical understanding of SGD convergence and is aimed at researchers in optimization and machine learning. It is incremental in that it builds directly on existing results.

The paper establishes convergence guarantees for continuous-time stochastic gradient descent (SGD) in learning problems. It extends prior results for non-stochastic gradient descent to give general sufficient conditions for convergence, and applies them to overparametrized neural network training.

We study a continuous-time approximation of the stochastic gradient descent process for minimizing the population expected loss in learning problems. The main results establish general sufficient conditions for the convergence, extending the results of Chatterjee (2022) established for (nonstochastic) gradient descent. We show how the main result can be applied to the case of overparametrized neural network training.
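To make the idea of a continuous-time approximation of SGD concrete, here is a minimal sketch that simulates a stochastic differential equation of the form dθ = -∇L(θ) dt + σ dW with the Euler-Maruyama scheme. The constant noise level `noise_scale`, the toy quadratic loss, and every function and parameter name are assumptions made for illustration; the abstract does not specify the paper's actual process, which may use a different (for example state-dependent) noise term.

```python
import math
import random


def simulate_sgd_sde(grad, theta0, step=0.01, noise_scale=0.1, n_steps=2000, seed=0):
    """Euler-Maruyama discretization of d(theta) = -grad(theta) dt + noise_scale dW.

    The constant-noise form and all names here are illustrative assumptions,
    not the model analyzed in the paper.
    """
    rng = random.Random(seed)
    theta = theta0
    path = [theta]
    for _ in range(n_steps):
        # Brownian increment over one time step: Gaussian with variance `step`.
        dw = rng.gauss(0.0, math.sqrt(step))
        theta = theta - grad(theta) * step + noise_scale * dw
        path.append(theta)
    return path


def quadratic_grad(theta):
    # Gradient of the toy loss L(theta) = 0.5 * theta**2, minimized at theta = 0.
    return theta


path = simulate_sgd_sde(quadratic_grad, theta0=5.0)
print(f"start: {path[0]:.3f}, end: {path[-1]:.3f}")
```

With constant noise the trajectory drifts toward the minimizer and then fluctuates around it rather than settling exactly. Setting `noise_scale=0` recovers a plain discretized gradient flow, the non-stochastic case that the abstract says the main results extend.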
