LG AI · Sep 2, 2023

Regularly Truncated M-estimators for Learning with Noisy Labels

Xiaobo Xia, Pengqian Lu, Chen Gong, Bo Han, Jun Yu, Jun Yu, Tongliang Liu

arXiv:2309.00894v111.519 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of noisy labels in machine learning, which is crucial for improving model reliability in real-world applications, but it is incremental as it builds on existing sample selection approaches.

The paper tackles learning with noisy labels by proposing regularly truncated M-estimators (RTME). RTME targets two limitations of sample selection methods: ignoring the noisy labels that remain among selected examples, and discarding large-loss examples that may still be useful. The paper reports that RTME outperforms multiple baselines and is robust to various noise types and levels.

The sample selection approach is very popular in learning with noisy labels. Because deep networks learn patterns first, prior methods built on sample selection share a similar training procedure: the small-loss examples are regarded as clean and used to help generalization, while the large-loss examples are treated as mislabeled and excluded from network parameter updates. However, such a procedure is debatable in two respects: (a) it does not consider the harmful influence of noisy labels among the selected small-loss examples; (b) it does not make good use of the discarded large-loss examples, which may be clean or carry meaningful information for generalization. In this paper, we propose regularly truncated M-estimators (RTME) to address both issues simultaneously. Specifically, RTME alternately switches between two modes: truncated M-estimators and original M-estimators. The former adaptively selects small-loss examples without knowing the noise rate and reduces the side effects of the noisy labels among them. The latter brings examples that are possibly clean but have large losses back into training to help generalization. Theoretically, we demonstrate that our strategies are label-noise-tolerant. Empirically, comprehensive experimental results show that our method outperforms multiple baselines and is robust to a broad range of noise types and levels.
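
The alternation between the two modes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the truncation rule, the switching schedule, or the M-estimator used, so the threshold (`mean + k * std` of the batch losses), the `period` schedule, and all function names below are hypothetical. The sketch covers only the selection and mode switching, not the robust loss shape that limits the effect of noisy labels among the kept examples.

```python
from statistics import mean, pstdev


def truncation_threshold(losses, k=1.0):
    # Hypothetical adaptive rule (assumption, not from the paper):
    # the threshold is derived from the losses themselves, so no
    # noise rate has to be supplied.
    return mean(losses) + k * pstdev(losses)


def is_truncated_epoch(epoch, period=5):
    # Hypothetical schedule (assumption): every `period`-th epoch uses
    # the original, untruncated estimator; all other epochs truncate.
    return (epoch + 1) % period != 0


def example_weights(losses, truncated, k=1.0):
    # Truncated mode: examples above the threshold get weight 0 and
    # so do not contribute to the update. Original mode: all kept.
    if not truncated:
        return [1.0] * len(losses)
    tau = truncation_threshold(losses, k)
    return [1.0 if loss <= tau else 0.0 for loss in losses]


def batch_objective(losses, epoch, period=5, k=1.0):
    # Average the per-example losses over the examples kept this epoch.
    weights = example_weights(losses, is_truncated_epoch(epoch, period), k)
    kept = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / max(kept, 1.0)


if __name__ == "__main__":
    losses = [0.1, 0.2, 0.15, 3.0]  # the last example has a large loss
    print(batch_objective(losses, epoch=0))  # truncated mode
    print(batch_objective(losses, epoch=4))  # original mode
```

In truncated epochs the large-loss example is dropped from the objective; in the periodic original-mode epochs it is included again, which is the mechanism the abstract credits with recovering clean examples that happen to have large losses.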
