P-DIFF: Learning Classifier with Noisy Labels based on Probability Difference Distributions
This work addresses noisy labels, a common problem in real-world machine learning datasets. The contribution is incremental: it builds on existing sample selection methods.
The paper tackles the problem of training deep neural network classifiers on noisy labels, which such networks can easily overfit. It proposes P-DIFF, a simple training paradigm that uses probability difference distributions to re-weight samples, and reports better performance than state-of-the-art sample selection methods on benchmark datasets.
Learning a deep neural network (DNN) classifier with noisy labels is a challenging task because the DNN, with its high capacity, can easily overfit these noisy labels. In this paper, we present a very simple but effective training paradigm called P-DIFF, which trains DNN classifiers while markedly alleviating the adverse impact of noisy labels. Our proposed probability difference distribution implicitly reflects the probability that a training sample is clean, and this probability is used to re-weight the corresponding sample during training. P-DIFF also achieves good performance without prior knowledge of the noise rate of the training samples. Experiments on benchmark datasets demonstrate that P-DIFF is superior to state-of-the-art sample selection methods.
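The re-weighting idea can be illustrated with a minimal sketch. The abstract does not define the probability difference or how it is mapped to a clean probability, so the code below rests on two labeled assumptions: the difference is taken as the softmax probability of the given label minus the largest probability among the other classes, and the empirical-CDF rank of that difference within the batch stands in for the clean probability. All function names are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def probability_difference(probs, labels):
    """ASSUMPTION: delta = p[label] - max over the other classes; lies in [-1, 1]."""
    rows = np.arange(probs.shape[0])
    p_label = probs[rows, labels]
    others = probs.copy()
    others[rows, labels] = -np.inf  # exclude the labeled class from the max
    return p_label - others.max(axis=1)


def clean_weights(delta):
    """ASSUMPTION: weight = fraction of samples whose delta is <= this sample's delta."""
    ordered = np.sort(delta)
    ranks = np.searchsorted(ordered, delta, side="right")
    return ranks / delta.size


def reweighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-sample negative log-likelihoods."""
    rows = np.arange(probs.shape[0])
    nll = -np.log(probs[rows, labels] + 1e-12)
    return float((weights * nll).sum() / weights.sum())
```

Under these assumptions, a sample whose given label disagrees with a confident prediction gets a negative difference and a small weight, so it contributes less to the loss. A version that tracks the distribution across the whole training set rather than a single batch would need a running histogram, which is omitted here.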