Do We Really Need Gold Samples for Sample Weighting Under Label Noise?
This work addresses the challenge of label noise in machine learning for practitioners who lack access to clean validation data, offering a practical solution that requires no gold samples.
The paper tackles the problem of training deep neural networks with noisy labels by proposing a method to train sample weighting networks without needing clean samples, achieving performance comparable to methods that use clean samples on benchmark datasets.
Learning with label noise has gained significant traction recently because deep neural networks trained with common loss functions are sensitive to label noise. Losses that are theoretically robust to label noise, however, often make training difficult. Consequently, several recently proposed methods, such as Meta-Weight-Net (MW-Net), use a small number of unbiased, clean samples to learn, under the meta-learning framework, a weighting function that downweights samples likely to have corrupted labels. However, obtaining such a set of clean samples is not always feasible in practice. In this paper, we analytically show that one can easily train MW-Net without access to clean samples simply by using a loss function that is robust to label noise, such as mean absolute error, as the meta objective to train the weighting network. We experimentally show that our method outperforms all existing methods that do not use clean samples and performs on par with methods that use gold samples on benchmark datasets across various noise types and noise rates.
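As an illustration of why mean absolute error (MAE) is commonly regarded as robust to label noise, the sketch below checks the standard symmetry condition: for a softmax output over K classes, the MAE loss summed over all K possible labels is the constant 2(K - 1), whatever the prediction, whereas the corresponding cross-entropy sum depends on the prediction. This is a minimal, self-contained example and not the paper's implementation; the function names and the example logits are hypothetical choices made here for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mae_loss(probs, label):
    # L1 distance between the one-hot label and the predicted distribution.
    # For a probability vector this equals 2 * (1 - probs[label]).
    return sum(abs((1.0 if k == label else 0.0) - p) for k, p in enumerate(probs))

def ce_loss(probs, label):
    # Standard cross-entropy for a single example.
    return -math.log(probs[label])

def loss_sum_over_labels(loss_fn, probs):
    # Sum of the loss over every possible label assignment.
    return sum(loss_fn(probs, k) for k in range(len(probs)))

# Two hypothetical predictions over K = 3 classes: one confident, one uncertain.
confident = softmax([4.0, 0.5, -1.0])
uncertain = softmax([0.2, 0.1, 0.0])

mae_sums = [loss_sum_over_labels(mae_loss, p) for p in (confident, uncertain)]
ce_sums = [loss_sum_over_labels(ce_loss, p) for p in (confident, uncertain)]

print("MAE sums:", mae_sums)  # both equal 2 * (K - 1) up to rounding
print("CE sums: ", ce_sums)   # differ between the two predictions
```

Because the MAE sum is the same for every prediction, uniformly flipping labels changes the expected loss only by a scaling and a constant, which leaves its minimizer unchanged under symmetric noise below the usual noise-rate threshold. Cross-entropy has no such invariance, which is consistent with the sensitivity to label noise described above.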