LG · ML · May 13, 2019

Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

Pengfei Chen, Benben Liao, Guangyong Chen, Shengyu Zhang

arXiv:1905.05040v134.1452 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of robustly training deep neural networks on real-world datasets with noisy labels, offering an incremental improvement over existing methods.

The paper tackles the problem of training deep neural networks with noisy labels by analyzing how test accuracy relates to noise ratio and proposing a method combining cross-validation and Co-teaching to identify correct labels. The strategy consistently improves generalization performance under synthetic and real-world noise compared to state-of-the-art methods.
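The cross-validation idea in the summary above can be illustrated with a small, self-contained sketch. It is not the paper's implementation: a nearest-centroid classifier on synthetic 1-D data stands in for a DNN, the data generator and all function names are hypothetical, and the Co-teaching stage is not shown. The sketch splits a noisy dataset into two random halves, trains on one half, and keeps the held-out samples whose given label agrees with the prediction.

```python
import random


def make_noisy_dataset(n, eps, seed=0):
    """Two 1-D Gaussian classes; each label is flipped with probability eps."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        true_label = rng.randrange(2)
        x = rng.gauss(-2.0 if true_label == 0 else 2.0, 1.0)
        noisy_label = 1 - true_label if rng.random() < eps else true_label
        data.append((x, noisy_label, true_label))
    return data


def fit_centroids(samples):
    """Toy stand-in for a DNN: mean feature value per (noisy) class."""
    sums, counts = [0.0, 0.0], [0, 0]
    for x, noisy_label, _ in samples:
        sums[noisy_label] += x
        counts[noisy_label] += 1
    return [sums[k] / counts[k] for k in range(2)]


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(range(2), key=lambda k: abs(x - centroids[k]))


def cross_validated_selection(data, seed=0):
    """Train on one random half, keep held-out samples whose noisy label
    matches the prediction, then swap the roles of the two halves."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    folds = (shuffled[:half], shuffled[half:])
    selected = []
    for train, held_out in (folds, folds[::-1]):
        centroids = fit_centroids(train)
        selected += [s for s in held_out if predict(centroids, s[0]) == s[1]]
    return selected


def noise_ratio(samples):
    """Fraction of samples whose noisy label differs from the true label."""
    return sum(noisy != true for _, noisy, true in samples) / len(samples)
```

On this toy data, comparing `noise_ratio(data)` with `noise_ratio(cross_validated_selection(data))` shows that the selected subset contains a much smaller share of mislabeled samples than the full dataset, which is the property a subsequent robust training stage can exploit.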

Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks (DNNs) because DNNs usually have a high capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains previously published experimental findings. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most samples that have correct labels. Then we adopt the Co-teaching strategy, which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with a wide range of state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both synthetic and real-world training noise.
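One way to see how a quadratic in the noise ratio can arise is the following sketch. It assumes, for illustration only, that accuracy is measured against noisy test labels and that the network's predictions are corrupted by the same symmetric noise as the labels, independently given the true class. Under those assumptions, with noise ratio ε and c classes, the prediction and the noisy label agree with probability (1 - ε)^2 + ε^2 / (c - 1). The function names are hypothetical, and the setup may differ from the paper's exact analysis.

```python
import random


def symmetric_noise(label, eps, num_classes, rng):
    """Keep `label` with probability 1 - eps, else flip uniformly to another class."""
    if rng.random() < eps:
        others = [k for k in range(num_classes) if k != label]
        return rng.choice(others)
    return label


def predicted_noisy_test_accuracy(eps, num_classes):
    """Closed form under the stated assumptions: (1 - eps)^2 + eps^2 / (c - 1)."""
    return (1.0 - eps) ** 2 + eps ** 2 / (num_classes - 1)


def simulated_noisy_test_accuracy(eps, num_classes, n=100_000, seed=0):
    """Monte Carlo estimate of how often prediction and noisy label agree."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        true_label = rng.randrange(num_classes)
        # Label and prediction are corrupted independently by the same noise.
        noisy_label = symmetric_noise(true_label, eps, num_classes, rng)
        prediction = symmetric_noise(true_label, eps, num_classes, rng)
        hits += prediction == noisy_label
    return hits / n
```

Calling `simulated_noisy_test_accuracy(0.4, 10)` and `predicted_noisy_test_accuracy(0.4, 10)` side by side gives a quick check that the simulated agreement rate tracks the closed-form quadratic.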
