LG AI May 25, 2023

SING: A Plug-and-Play DNN Learning Technique

Adrien Courtois, Damien Scieur, Jean-Michel Morel, Pablo Arias, Thomas Eboli

arXiv:2305.15997v12.0 Has Code

Originality: Incremental advance

AI Analysis

This incremental advance targets optimization stability for deep learning practitioners, with the aim of more stable training and better generalization in neural networks.

The authors address the instability and poor generalization of the Adam(W) optimizer with SING, a plug-and-play technique that standardizes gradients layer-wise and adds no extra hyperparameters. They report improved results across a range of architectures and tasks, including image classification, depth estimation, and natural language processing.

We propose SING (StabIlized and Normalized Gradient), a plug-and-play technique that improves the stability and generalization of the Adam(W) optimizer. SING is straightforward to implement and has minimal computational overhead, requiring only a layer-wise standardization of the gradients fed to Adam(W) without introducing additional hyper-parameters. We support the effectiveness and practicality of the proposed approach by showing improved results on a wide range of architectures, problems (such as image classification, depth estimation, and natural language processing), and in combination with other optimizers. We provide a theoretical analysis of the convergence of the method, and we show that by virtue of the standardization, SING can escape local minima narrower than a threshold that is inversely proportional to the network's depth.
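
To make the "layer-wise standardization of the gradients" concrete, here is a minimal sketch in plain Python. It reads "standardization" as subtracting each layer's mean gradient and dividing by that layer's standard deviation; the exact centering axes and normalization used in the paper may differ. The function name `standardize_layerwise`, the `eps` constant, the layer names, and the flat-list representation of each layer's gradient are all assumptions made for illustration, not the authors' implementation.

```python
import math


def standardize_layerwise(grads, eps=1e-8):
    """Return a new dict in which each layer's gradient is centered and rescaled.

    `grads` maps a (hypothetical) layer name to a flat list of gradient values.
    `eps` is an assumed small constant that guards against division by zero.
    """
    out = {}
    for name, g in grads.items():
        n = len(g)
        mean = sum(g) / n
        centered = [v - mean for v in g]
        # Population standard deviation of this layer's gradient.
        std = math.sqrt(sum(c * c for c in centered) / n)
        out[name] = [c / (std + eps) for c in centered]
    return out


# Hypothetical gradients for two layers whose scales differ by five orders of magnitude.
grads = {
    "layer1.weight": [100.0, 300.0, 200.0, 400.0],
    "layer2.weight": [0.001, 0.003, 0.002, 0.004],
}
standardized = standardize_layerwise(grads)
```

After this step both layers carry gradients with zero mean and roughly unit variance, so the values handed to the optimizer no longer depend on each layer's raw gradient scale. In a framework such as PyTorch, the same transform would be applied to each parameter's gradient between the backward pass and the Adam(W) step.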

