Noisy Early Stopping for Noisy Labels
The work addresses the cost of obtaining clean validation sets for practitioners working with noisy labels, though the contribution is incremental because it builds on existing early stopping methods.
The paper tackles the problem of overfitting when training neural networks on datasets with noisy labels by proposing Noisy Early Stopping (NES), which uses a noisy validation set instead of a clean one, achieving near-optimal results and reducing implementation costs.
Training neural network classifiers on datasets contaminated with noisy labels significantly increases the risk of overfitting. Thus, effectively implementing Early Stopping in noisy label environments is crucial. Under ideal circumstances, Early Stopping utilises a validation set uncorrupted by label noise to monitor generalisation during training. However, a noise-free validation dataset can be costly and challenging to obtain. This study establishes that, in many typical learning environments, a noise-free validation set is not necessary for effective Early Stopping. Instead, near-optimal results can be achieved by monitoring accuracy on a noisy dataset drawn from the same distribution as the noisy training set. Referred to as 'Noisy Early Stopping' (NES), this method simplifies and reduces the cost of implementing Early Stopping. We provide theoretical insights into the conditions under which this method is effective and empirically demonstrate its robust performance across standard benchmarks using common loss functions.
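To make the stopping rule concrete, the following is a minimal sketch of how NES might be applied once per-epoch accuracy on a noisy held-out split is available. The abstract specifies only that accuracy on a noisy validation set is monitored; the function name `noisy_early_stopping_epoch`, the `patience` parameter, and the example accuracy curve are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def noisy_early_stopping_epoch(noisy_val_acc: Sequence[float], patience: int = 3) -> int:
    """Pick the epoch (0-based) whose checkpoint should be restored.

    noisy_val_acc[i] is the accuracy after epoch i, measured against the
    *noisy* labels of a held-out split drawn from the same distribution
    as the noisy training set. No clean labels are used.

    `patience` is an assumed hyperparameter: training is considered
    finished after this many consecutive epochs without improvement.
    """
    if not noisy_val_acc:
        raise ValueError("need at least one epoch of validation accuracy")

    best_epoch, best_acc, waited = 0, noisy_val_acc[0], 0
    for epoch, acc in enumerate(noisy_val_acc[1:], start=1):
        if acc > best_acc:
            # New best noisy-validation accuracy: reset the patience counter.
            best_epoch, best_acc, waited = epoch, acc, 0
        else:
            waited += 1
            if waited >= patience:
                break  # stop monitoring; later epochs are never evaluated
    return best_epoch


# Hypothetical noisy-validation accuracy curve (made-up numbers for illustration).
curve = [0.41, 0.52, 0.58, 0.61, 0.60, 0.59, 0.57, 0.62]
stop_epoch = noisy_early_stopping_epoch(curve, patience=3)
print(f"restore checkpoint from epoch {stop_epoch}")
```

The only difference from conventional Early Stopping in this sketch is where `noisy_val_acc` comes from: the held-out split keeps its noisy labels rather than being cleaned. The choice of `patience` trades training time against the risk of stopping on a temporary plateau, and is independent of the NES idea itself.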