Adam-SHANG: A Convergent Adam-Type Method for Stochastic Smooth Convex Optimization
The paper provides a provably convergent variant of Adam for stochastic smooth convex optimization, addressing a known gap in the theory of adaptive methods. Its practical impact is limited to convex settings, and the non-convex experiments are preliminary.
Adam-SHANG achieves provable convergence for stochastic smooth convex optimization by coupling momentum, adaptive preconditioning, and a curvature-aware correction, with experiments showing competitive performance against Adam and AdamW on deep learning tasks.
We propose Adam-SHANG, a Lyapunov-guided Adam-type method that couples momentum, adaptive preconditioning, and a curvature-aware correction through a more stable lagged-preconditioner update. For stochastic smooth convex optimization, we prove convergence in expectation under an admissible stepsize condition that can always be satisfied by a conservative spectral bound, without imposing global monotonicity on the second-moment sequence. To obtain a less conservative practical rule, we introduce a computable trace-ratio stepsize, motivated by a local coordinatewise alignment condition. The same structural update is also tested beyond the convex setting with simplified parameters. Experiments validate the predicted stochastic decay and show competitive training performance against Adam and AdamW on deep learning tasks.
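To make the phrase "lagged-preconditioner update" concrete, here is a minimal sketch of a generic Adam-type step in which the coordinatewise scaling is built from the previous second-moment estimate rather than the current one. This is not the Adam-SHANG update: the abstract does not state the exact form of the curvature-aware correction or the trace-ratio stepsize, so neither appears here. The function name `lagged_adam_step`, the fixed step size, the initialization of `v` to ones, and the noisy quadratic test problem are all assumptions chosen for illustration.

```python
import numpy as np

def lagged_adam_step(x, m, v, grad, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-type step preconditioned by the PREVIOUS second moment (illustrative only)."""
    # The preconditioner uses v from before the current gradient is seen,
    # so the scaling does not depend on the current stochastic gradient.
    precond = 1.0 / (np.sqrt(v) + eps)
    # Standard exponential moving average of gradients (momentum).
    m_new = beta1 * m + (1.0 - beta1) * grad
    # Parameter update with the lagged coordinatewise scaling.
    x_new = x - lr * precond * m_new
    # The second moment is refreshed only after the step has been taken.
    v_new = beta2 * v + (1.0 - beta2) * grad ** 2
    return x_new, m_new, v_new

# Hypothetical test problem: convex quadratic f(x) = 0.5 * sum(a * x^2)
# with additive Gaussian gradient noise.
a = np.array([1.0, 10.0])

def f(x):
    return 0.5 * np.sum(a * x ** 2)

rng = np.random.default_rng(0)
x = np.array([5.0, -3.0])
m = np.zeros_like(x)
v = np.ones_like(x)  # assumption: nonzero start so the first step is well scaled
f_init = f(x)

for _ in range(2000):
    noisy_grad = a * x + 0.1 * rng.standard_normal(x.shape)
    x, m, v = lagged_adam_step(x, m, v, noisy_grad)

f_final = f(x)
print(f"f(x0) = {f_init:.3f}, f(x_final) = {f_final:.6f}")
```

In this sketch the only difference from a textbook Adam step (bias correction aside) is the order of operations: the scaling is computed before `v` is updated. A step-size rule such as the conservative spectral bound or the trace-ratio rule mentioned in the abstract would replace the fixed `lr` used here.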