Skeptical Deep Learning with Distribution Correction
The work addresses the high cost of obtaining large-scale credible labels in real-world applications, a practical concern for machine learning practitioners, though it appears incremental because it builds on existing robust learning methods.
The paper tackles the problem of deep neural networks overfitting to imperfectly labeled training data. It develops a distribution correction approach that treats the noisy input as samples from an incorrect distribution and corrects that distribution during training. The results show significantly higher prediction and recovery accuracy than alternative methods on classification datasets with noisy labels.
Recently, deep neural networks have been successfully used for various classification tasks, especially for problems with massive, perfectly labeled training data. However, it is often costly to obtain large-scale credible labels in real-world applications. One solution is to make supervised learning robust to imperfectly labeled input. In this paper, we develop a distribution correction approach that allows deep neural networks to avoid overfitting to imperfect training data. Specifically, we treat the noisy input as samples from an incorrect distribution, which is automatically corrected during our training process. We test our approach on several classification datasets with elaborately generated noisy labels. The results show significantly higher prediction and recovery accuracy with our approach compared to alternative methods.
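To make the evaluation setting concrete, the following is a minimal sketch of how noisy labels might be generated and how label recovery might be scored. It is not the procedure from the paper: the abstract does not specify the noise model, so symmetric (uniform) label flipping is assumed here, and "recovery accuracy" is assumed to mean the fraction of training labels that match the clean labels. The function names `add_symmetric_label_noise` and `recovery_accuracy` are hypothetical.

```python
import random


def add_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label with probability noise_rate to a different class.

    Assumption: symmetric noise, i.e. the wrong class is chosen uniformly
    among the other num_classes - 1 classes. The paper may use another scheme.
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from num_classes - 1 slots and skip over the true class.
            other = rng.randrange(num_classes - 1)
            noisy.append(other if other < y else other + 1)
        else:
            noisy.append(y)
    return noisy


def recovery_accuracy(recovered, clean):
    """Fraction of labels that agree with the clean labels (assumed definition)."""
    matches = sum(r == c for r, c in zip(recovered, clean))
    return matches / len(clean)


# Toy data: 1000 clean labels over 10 classes, corrupted at a 30% noise rate.
clean = [i % 10 for i in range(1000)]
noisy = add_symmetric_label_noise(clean, num_classes=10, noise_rate=0.3)

# Scoring the uncorrected noisy labels gives the baseline that a
# correction method would need to improve on.
print(recovery_accuracy(noisy, clean))
```

Under these assumptions, a correction method would be judged by how far its recovered training labels raise this score above the noisy baseline, alongside its prediction accuracy on clean test data.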