CE LG NA

Mar 20, 2024

Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach

Abhinab Bhattacharjee, Andrey A. Popov, Arash Sarshar, Adrian Sandu

arXiv:2403.13704v24.36 citations

h-index: 11

Has Code

Originality: Incremental advance

AI Analysis

This work offers an incremental improvement for machine learning practitioners by enhancing a widely used optimizer.

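The authors show that the Adam optimizer corresponds to an underlying ODE in the limit of very small learning rates, and that classical Adam is a first-order IMEX Euler discretization of that ODE. They then propose higher-order IMEX time-stepping methods for the same ODE, yielding a new algorithm that the paper reports performs better than classical Adam on several regression and classification problems.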

The Adam optimizer, often used in Machine Learning for neural network training, corresponds to an underlying ordinary differential equation (ODE) in the limit of very small learning rates. This work shows that the classical Adam algorithm is a first-order implicit-explicit (IMEX) Euler discretization of the underlying ODE. Employing the time discretization point of view, we propose new extensions of the Adam scheme obtained by using higher-order IMEX methods to solve the ODE. Based on this approach, we derive a new optimization algorithm for neural network training that performs better than classical Adam on several regression and classification problems.
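To make the time-stepping view concrete, here is a minimal pure-Python sketch of one classical Adam step (the standard algorithm, not the higher-order method the paper derives). The comments point out the ordering that the abstract describes as IMEX Euler: the gradient is evaluated at the old parameters, and the parameters are then advanced with the freshly updated moments. That reading is an informal interpretation of the abstract, and the names `adam_step` and `minimize` are hypothetical helpers introduced only for this illustration.

```python
import math

def adam_step(theta, m, v, grad, k, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One classical Adam step on lists of floats; k is the 1-based step count."""
    # "Explicit" part (informal reading): gradient at the current, old parameters.
    g = grad(theta)
    # First- and second-moment updates driven by that old-step gradient.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Standard bias correction for the zero-initialized moments.
    m_hat = [mi / (1 - beta1 ** k) for mi in m]
    v_hat = [vi / (1 - beta2 ** k) for vi in v]
    # "Implicit" part (informal reading): parameters use the NEW moments.
    theta = [t - lr * mh / (math.sqrt(vh) + eps)
             for t, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v

def minimize(grad, theta0, steps=2000, lr=1e-2):
    """Run `steps` Adam iterations from theta0 and return the final parameters."""
    theta = list(theta0)
    m = [0.0] * len(theta)
    v = [0.0] * len(theta)
    for k in range(1, steps + 1):
        theta, m, v = adam_step(theta, m, v, grad, k, lr=lr)
    return theta

# Toy problem (an assumption for illustration): f(x) = (x - 3)^2.
quadratic_grad = lambda th: [2.0 * (th[0] - 3.0)]
theta_final = minimize(quadratic_grad, [0.0])
```

On this toy quadratic, `theta_final[0]` should end up close to 3. The learning rate `lr` plays the role of the time step, so a higher-order scheme in the paper's sense would replace the single-stage update above with a multi-stage one; the details of that scheme are in the paper and are not reproduced here.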
