OC · LG · Nov 9, 2022

Perturbed Iterate SGD for Lipschitz Continuous Loss Functions with Numerical Error and Adaptive Step Sizes

arXiv:2211.04655v8 · 2 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses neural network training in finite-precision arithmetic, an incremental advance toward convergence guarantees that hold up in practical implementations.

The paper analyzes the convergence of perturbed iterate SGD with adaptive step sizes under numerical error. For Lipschitz continuous loss functions, it proves asymptotic convergence to a Clarke stationary point and non-asymptotic convergence in expectation to an approximate stationary point.
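For readers unfamiliar with the terminology, the standard textbook definitions behind "Clarke stationary point" can be written as follows. The symbols $f$, $D$, $\partial f$, $\partial_\delta f$, $\epsilon$ and $\delta$ are notation chosen here for illustration. The Goldstein-style notion of approximate stationarity shown last is one common choice and is an assumption; the paper's exact definition may differ.

```latex
% f : R^n -> R is locally Lipschitz, so by Rademacher's theorem it is
% differentiable on a set D of full Lebesgue measure.
\[
  \partial f(x) \;=\; \operatorname{conv}\Bigl\{ \lim_{i \to \infty} \nabla f(x_i) \;:\; x_i \to x,\ x_i \in D \Bigr\}
\]
% x is a Clarke stationary point of f when
\[
  0 \in \partial f(x).
\]
% One common notion of approximate stationarity (assumed here) uses the
% Goldstein \delta-subdifferential:
\[
  \partial_\delta f(x) \;=\; \operatorname{conv}\Bigl( \bigcup_{\|y - x\| \le \delta} \partial f(y) \Bigr),
  \qquad
  \operatorname{dist}\bigl(0,\ \partial_\delta f(x)\bigr) \le \epsilon .
\]
```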

Motivated by neural network training in finite-precision arithmetic environments, this work studies the convergence of perturbed iterate SGD using adaptive step sizes in an environment with numerical error. Considering a general stochastic Lipschitz continuous loss function, an asymptotic convergence result to a Clarke stationary point is proven as well as the non-asymptotic convergence to an approximate stationary point in expectation. It is assumed that only an approximation of the loss function's stochastic gradient can be computed, in addition to error in computing the SGD step itself.
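The setting described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names (`perturbed_sgd`, `noisy_l1_grad`), the parameters (`sigma`, `eta0`), the uniform-ball perturbation, the AdaGrad-norm style step size, and the use of `float16` rounding as a stand-in for numerical error are all assumptions made for illustration.

```python
import numpy as np

def perturbed_sgd(grad_fn, x0, steps=500, sigma=1e-2, eta0=0.5,
                  dtype=np.float16, seed=0):
    """Hypothetical sketch of perturbed iterate SGD in low precision."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=dtype)          # iterate stored in low precision
    accum = 0.0                              # running sum of squared gradient norms
    for _ in range(steps):
        # Sample a perturbation uniformly from the ball of radius sigma
        # (assumed perturbation distribution).
        u = rng.normal(size=x.shape)
        u *= sigma * rng.uniform() ** (1.0 / x.size) / np.linalg.norm(u)
        # Stochastic gradient at the perturbed iterate; rounding it to
        # `dtype` models an approximately computed gradient.
        g = np.asarray(grad_fn(x.astype(np.float64) + u, rng), dtype=dtype)
        # AdaGrad-norm style adaptive step size (an assumed stand-in rule).
        g64 = g.astype(np.float64)
        accum += float(np.dot(g64, g64))
        eta = eta0 / np.sqrt(1.0 + accum)
        # The update itself is carried out in low precision, which models
        # error in computing the SGD step.
        x = (x - np.asarray(eta, dtype=dtype) * g).astype(dtype)
    return x

def noisy_l1_grad(x, rng):
    # Noisy subgradient of the Lipschitz, nonsmooth loss f(x) = ||x||_1.
    return np.sign(x) + 0.1 * rng.normal(size=x.shape)

x_final = perturbed_sgd(noisy_l1_grad, [1.0, -2.0])
print(x_final)
```

Evaluating the gradient at a randomly perturbed point rather than at the iterate itself is what makes the method well defined on nonsmooth Lipschitz losses, since the loss is differentiable almost everywhere. In this toy run the iterate should settle near the minimizer at the origin, up to the step size and rounding error.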

