CV · Apr 30, 2022

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels

Kai Wang, Xiangyu Peng, Shuo Yang, Jianfei Yang, Zheng Zhu, Xinchao Wang, Yang You

arXiv:2205.00186v28.19 citations · h-index: 67 · Has Code

Originality Highly original

AI Analysis

This paper addresses robust machine learning for practitioners dealing with heavily mislabeled datasets; it represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles learning with extremely noisy labels by introducing LC-Booster, a framework that incorporates reliable label correction into sample selection to alleviate confirmation bias, achieving state-of-the-art results with 92.9% accuracy on CIFAR-10 and 48.4% on CIFAR-100 under 90% noise.

Learning with noisy labels has attracted much research interest, since data annotations, especially for large-scale datasets, may be imperfect. Recent approaches recast the task as a semi-supervised learning problem by dividing training samples into clean and noisy sets. This paradigm, however, degrades significantly under heavy label noise, because the number of clean samples is too small for conventional methods to behave well. In this paper, we introduce a novel framework, termed LC-Booster, to explicitly tackle learning under extreme noise. The core idea of LC-Booster is to incorporate label correction into sample selection, so that more samples, purified through reliable label correction, can be used for training, thereby alleviating confirmation bias. Experiments show that LC-Booster advances state-of-the-art results on several noisy-label benchmarks, including CIFAR-10, CIFAR-100, Clothing1M and WebVision. Remarkably, under the extreme 90% noise ratio, LC-Booster achieves 92.9% and 48.4% accuracy on CIFAR-10 and CIFAR-100, respectively, surpassing state-of-the-art methods by a large margin.
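
The core idea in the abstract, folding label correction into sample selection, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how reliability is judged, so the function name `reliable_label_correction`, the confidence threshold `tau`, and the rule "relabel a noisy-set sample when the model's top predicted probability reaches `tau`" are all assumptions made here for illustration. The clean/noisy split is taken as an input produced by some earlier selection step.

```python
def reliable_label_correction(labels, probs, clean_mask, tau=0.9):
    """Move confidently predicted samples from the noisy set to the clean set.

    labels:     list of int, the given (possibly wrong) labels
    probs:      list of per-class probability lists from the current model
    clean_mask: list of bool, True if an earlier selection step judged
                the sample clean
    tau:        hypothetical confidence threshold (an assumption, not a
                value taken from the paper)

    Returns (new_labels, new_clean_mask).
    """
    new_labels, new_mask = [], []
    for y, p, is_clean in zip(labels, probs, clean_mask):
        # Index of the most probable class under the current model.
        pred = max(range(len(p)), key=p.__getitem__)
        if not is_clean and p[pred] >= tau:
            # Confident prediction on a noisy-set sample: replace the
            # label and treat the sample as clean from now on.
            new_labels.append(pred)
            new_mask.append(True)
        else:
            # Otherwise keep the original label and set membership.
            new_labels.append(y)
            new_mask.append(is_clean)
    return new_labels, new_mask


labels = [0, 1, 2, 1]
probs = [
    [0.80, 0.10, 0.10],
    [0.05, 0.90, 0.05],
    [0.95, 0.03, 0.02],  # flagged noisy, model is confident it is class 0
    [0.40, 0.35, 0.25],  # flagged noisy, model is not confident
]
clean_mask = [True, True, False, False]

new_labels, new_mask = reliable_label_correction(labels, probs, clean_mask)
print(new_labels)  # [0, 1, 0, 1]
print(new_mask)    # [True, True, True, False]
```

In this toy run the third sample is relabeled and joins the clean set, which enlarges the labeled pool available for training, while the fourth stays in the noisy set because the prediction is below the threshold.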
