LG, Aug 1, 2023

Divergence of the ADAM algorithm with fixed-stepsize: a (very) simple example

arXiv:2308.00720v1 | 2 citations | h-index: 59
Originality: Synthesis-oriented
AI Analysis

This work identifies a flaw in ADAM's convergence guarantees that matters to practitioners in optimization and machine learning. The contribution is incremental: it consists of a counterexample to existing assumptions.

The authors construct a simple one-dimensional function with Lipschitz continuous gradient on which the ADAM algorithm with constant stepsize, started from the origin and run without gradient noise, diverges regardless of the choice of method parameters.

A very simple unidimensional function with Lipschitz continuous gradient is constructed such that the ADAM algorithm with constant stepsize, started from the origin, diverges when applied to minimize this function in the absence of noise on the gradient. Divergence occurs irrespective of the choice of the method parameters.
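For readers who want to see the iteration the abstract refers to, the following is a minimal sketch of noiseless one-dimensional ADAM with a constant stepsize, started from the origin. It assumes the commonly used bias-corrected form of the update; the helper name `adam_1d`, the default values of `alpha`, `beta1`, `beta2` and `eps`, and the quadratic objective are illustrative assumptions, and the paper's exact formulation may differ. The objective is a placeholder and is not the counterexample constructed in the paper.

```python
import math


def adam_1d(grad, x0=0.0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Run deterministic 1-D ADAM with constant stepsize alpha.

    `grad` is the exact (noise-free) derivative of the objective.
    Returns the list of iterates, starting with x0.
    """
    x, m, v = x0, 0.0, 0.0
    iterates = [x]
    for k in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment average
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment average
        m_hat = m / (1.0 - beta1 ** k)           # bias corrections
        v_hat = v / (1.0 - beta2 ** k)
        x = x - alpha * m_hat / (math.sqrt(v_hat) + eps)  # alpha never decays
        iterates.append(x)
    return iterates


# Placeholder objective f(x) = 0.5 * (x - 1)^2 (an assumption, NOT the
# paper's function); its derivative is x - 1.
xs = adam_1d(lambda x: x - 1.0)

# Same objective scaled by 1000, to compare the size of the first step.
xs_scaled = adam_1d(lambda x: 1000.0 * (x - 1.0))
```

In this form the first step has magnitude close to `alpha` whatever the gradient scale, since the bias-corrected ratio `m_hat / sqrt(v_hat)` equals the sign of the first gradient. Swapping in a different `grad` is the way to experiment with other one-dimensional objectives.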
