LG AI CV ML | Apr 24, 2025

OUI Need to Talk About Weight Decay: A New Perspective on Overfitting Detection

Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-Ortí

arXiv:2504.17160v1 | 15.77 citations | h-index: 16 | Has Code | 2025 International Conference on Advanced Machine Learning and Data Science (AMLDS)

Originality: Incremental advance

AI Analysis

OUI gives practitioners a tool for tuning regularization more efficiently, but the advance is incremental because it builds on existing hyperparameter optimization methods.

The paper addresses the selection of the weight decay hyperparameter in deep neural networks by introducing the Overfitting-Underfitting Indicator (OUI), which monitors training dynamics without validation data. According to the authors, OUI converges faster than loss or accuracy, and keeping it within a prescribed interval correlates with better generalization and validation scores on CIFAR-100, TinyImageNet, and ImageNet-1K.

We introduce the Overfitting-Underfitting Indicator (OUI), a novel tool for monitoring the training dynamics of Deep Neural Networks (DNNs) and identifying optimal regularization hyperparameters. Specifically, we validate that OUI can effectively guide the selection of the Weight Decay (WD) hyperparameter by indicating whether a model is overfitting or underfitting during training without requiring validation data. Through experiments on DenseNet-BC-100 with CIFAR-100, EfficientNet-B0 with TinyImageNet, and ResNet-34 with ImageNet-1K, we show that maintaining OUI within a prescribed interval correlates strongly with improved generalization and validation scores. Notably, OUI converges significantly faster than traditional metrics such as loss or accuracy, enabling practitioners to identify optimal WD (hyperparameter) values within the early stages of training. By leveraging OUI as a reliable indicator, we can determine early in training whether the chosen WD value leads the model to underfit the training data, overfit, or strike a well-balanced trade-off that maximizes validation scores. This enables more precise WD tuning for optimal performance on the tested datasets and DNNs. All code for reproducing these experiments is available at https://github.com/AlbertoFdezHdez/OUI.
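The interval-based selection described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the OUI formula or the bounds of the prescribed interval, so the functions assume OUI readings have already been computed elsewhere, and `band_position`, `screen_weight_decays`, `low`, `high`, and `tail` are hypothetical names. The sketch also makes no claim about which side of the interval corresponds to overfitting or underfitting.

```python
from typing import Dict, List


def band_position(oui: float, low: float, high: float) -> str:
    """Locate one OUI reading relative to the interval [low, high].

    `low` and `high` are placeholders for the prescribed interval,
    whose actual bounds are not stated in the abstract.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if oui < low:
        return "below"
    if oui > high:
        return "above"
    return "inside"


def screen_weight_decays(
    traces: Dict[float, List[float]],
    low: float,
    high: float,
    tail: int = 3,
) -> Dict[float, str]:
    """Classify each weight decay candidate from its early-training OUI trace.

    `traces` maps a weight decay value to the OUI readings logged so far.
    The mean of the last `tail` readings is compared against the interval,
    an assumed smoothing choice rather than anything taken from the paper.
    """
    if tail < 1:
        raise ValueError("tail must be at least 1")
    verdicts: Dict[float, str] = {}
    for weight_decay, readings in traces.items():
        if not readings:
            raise ValueError(f"no OUI readings for weight decay {weight_decay}")
        recent = readings[-tail:]
        verdicts[weight_decay] = band_position(sum(recent) / len(recent), low, high)
    return verdicts
```

Under these assumptions, candidates labelled "inside" after the first few epochs would be the ones to keep training, while the others could be stopped early or have their weight decay adjusted.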
