Probabilistic End-to-end Noise Correction for Learning with Noisy Labels
This work addresses the challenge of training deep learning models with noisy labels, a common issue in data collection. The contribution is incremental: it builds on existing methods but offers improved generality and robustness.
The paper tackles learning with noisy labels, which causes overfitting and accuracy drops in deep networks. It proposes PENCIL, an end-to-end framework that jointly updates network parameters and label distributions, and reports large performance margins over previous state-of-the-art methods on synthetic and real-world datasets.
Deep learning has achieved excellent performance in various computer vision tasks, but it requires a large number of training examples with clean labels. Datasets with noisy labels are easy to collect, but such noise causes networks to overfit severely and accuracy to drop dramatically. To address this problem, we propose an end-to-end framework called PENCIL, which updates both the network parameters and the label estimates, with the latter maintained as label distributions. PENCIL is independent of the backbone network structure and needs neither an auxiliary clean dataset nor prior information about the noise; it is therefore more general and robust than existing methods and is easy to apply. PENCIL outperforms previous state-of-the-art methods by large margins on both synthetic and real-world datasets with different noise types and noise rates. Experiments show that PENCIL is also robust on clean datasets.
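To make the idea of updating network parameters and label distributions together more concrete, the following is a minimal NumPy sketch, not the PENCIL implementation. It assumes a linear softmax classifier in place of a deep backbone, a plain cross-entropy between the label distribution and the prediction as the only loss, and hypothetical names and hyperparameters (`joint_update`, `label_logits`, `init_scale`, `lr_w`, `lr_y`) chosen for illustration. The actual objective and training schedule of PENCIL may differ.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def joint_update(X, noisy_labels, num_classes, steps=200,
                 lr_w=0.5, lr_y=1.0, init_scale=5.0, seed=0):
    """Illustrative joint update of classifier weights and per-sample
    label distributions. All hyperparameters are assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.01 * rng.standard_normal((d, num_classes))

    # One trainable logit vector per sample, initialised from the noisy
    # one-hot label so the initial distribution favours the given label.
    label_logits = init_scale * np.eye(num_classes)[noisy_labels]

    for _ in range(steps):
        p = softmax(X @ W)           # model predictions
        q = softmax(label_logits)    # current label distributions

        # Per-sample loss (assumed): -sum_k q_k * log p_k.
        # Gradient w.r.t. the classifier logits is p - q because q sums
        # to 1; it is averaged over the batch for the weight update.
        grad_W = X.T @ (p - q) / n

        # Gradient w.r.t. the label logits, back-propagated through the
        # softmax that produces q (per sample, not averaged).
        g = -np.log(p + 1e-12)
        grad_y = q * (g - (q * g).sum(axis=1, keepdims=True))

        W -= lr_w * grad_W
        label_logits -= lr_y * grad_y

    return W, softmax(label_logits)


if __name__ == "__main__":
    # Toy two-class data with a few labels flipped on purpose.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2.0, 1.0, (20, 2)),
                   rng.normal(2.0, 1.0, (20, 2))])
    labels = np.array([0] * 20 + [1] * 20)
    labels[:3] = 1  # injected label noise
    W, label_dist = joint_update(X, labels, num_classes=2)
    print(label_dist[:5].round(3))
```

The point of the sketch is only that the labels are treated as trainable quantities alongside the weights, so both receive gradient updates from the same loss. Whether the noisy labels are actually corrected depends on the loss design and learning rates, which this toy version does not attempt to reproduce.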