ML · LG · Jun 24, 2023

G-TRACER: Expected Sharpness Optimization

arXiv:2306.13914v1 · h-index: 21

Originality: Incremental advance

AI Analysis

This work addresses generalization in deep learning, a practical concern for practitioners, but the advance is incremental because it builds on existing flat-minima and regularization concepts.

The authors propose G-TRACER, a regularization scheme that promotes flat minima to improve generalization in deep learning, and demonstrate competitive performance on benchmark computer vision and NLP datasets, especially in low signal-to-noise ratio scenarios.

We propose a new regularization scheme for the optimization of deep learning architectures, G-TRACER ("Geometric TRACE Ratio"), which promotes generalization by seeking flat minima and has a sound theoretical basis as an approximation to a natural-gradient-descent-based optimization of a generalized Bayes objective. By augmenting the loss function with a TRACER, curvature-regularized optimizers (e.g., SGD-TRACER and Adam-TRACER) are simple to implement as modifications to existing optimizers and do not require extensive tuning. We show that the method converges to a neighborhood (depending on the regularization strength) of a local minimum of the unregularized objective, and demonstrate competitive performance on a number of benchmark computer vision and NLP datasets, with a particular focus on challenging low signal-to-noise ratio problems.
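The abstract does not give the exact form of the TRACER term, so the following is only a rough illustration of the general idea of curvature-regularized training, not the authors' method. As a stand-in, it penalizes the trace of the loss Hessian, estimated with Hutchinson's method and finite differences. The names `hessian_trace`, `regularized_loss`, `rho`, and the two toy losses are hypothetical and chosen for this sketch only.

```python
import random

def grad(loss, w, eps=1e-5):
    """Central-difference gradient of a scalar loss at w (a list of floats)."""
    g = []
    for i in range(len(w)):
        hi = w[:i] + [w[i] + eps] + w[i + 1:]
        lo = w[:i] + [w[i] - eps] + w[i + 1:]
        g.append((loss(hi) - loss(lo)) / (2 * eps))
    return g

def hessian_trace(loss, w, n_probes=8, eps=1e-3, seed=0):
    """Hutchinson estimate of tr(H): mean of v^T H v over Rademacher probes v."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in w]
        # Hessian-vector product via a central difference of gradients.
        g_hi = grad(loss, [wi + eps * vi for wi, vi in zip(w, v)])
        g_lo = grad(loss, [wi - eps * vi for wi, vi in zip(w, v)])
        hv = [(a - b) / (2 * eps) for a, b in zip(g_hi, g_lo)]
        total += sum(vi * hvi for vi, hvi in zip(v, hv))
    return total / n_probes

def regularized_loss(loss, w, rho):
    """Loss plus a curvature penalty (assumed stand-in, not the paper's TRACER term)."""
    return loss(w) + rho * hessian_trace(loss, w)

# Two toy losses with the same minimum value (0 at the origin) but different curvature.
def sharp(w):
    return 0.5 * (50.0 * w[0] ** 2 + 50.0 * w[1] ** 2)

def flat(w):
    return 0.5 * (1.0 * w[0] ** 2 + 1.0 * w[1] ** 2)

if __name__ == "__main__":
    w0 = [0.0, 0.0]
    print("sharp:", regularized_loss(sharp, w0, rho=0.1))
    print("flat: ", regularized_loss(flat, w0, rho=0.1))
```

Both toy minima have the same unregularized loss, yet the penalized objective is lower at the flat one. That preference for flat minima is the qualitative behavior the abstract attributes to curvature-regularized optimizers; the strength of the preference is set here by the assumed parameter `rho`.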
