CV | Aug 5, 2022

Neighborhood Collective Estimation for Noisy Label Identification and Correction

Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu

arXiv:2208.03207v1 | 15.349 citations | h-index: 71 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses model overfitting to noisy labels and offers a new approach to improving generalization. The advance is incremental, since it builds on existing noise verification and label correction methods.

The paper tackles learning with noisy labels by proposing Neighborhood Collective Estimation, which uses feature-space nearest neighbors to identify and correct noisy labels, and reports state-of-the-art performance on four benchmark datasets: CIFAR-10, CIFAR-100, Clothing-1M, and Webvision-1.0.

Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels. The key success of LNL lies in identifying as many clean samples as possible from massive noisy data, while rectifying the wrongly assigned noisy labels. Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias. To mitigate this issue, we propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors. Specifically, our method is divided into two steps: 1) Neighborhood Collective Noise Verification to separate all training samples into a clean or noisy subset, 2) Neighborhood Collective Label Correction to relabel noisy samples, and then auxiliary techniques are used to assist further model optimization. Extensive experiments on four commonly used benchmark datasets, i.e., CIFAR-10, CIFAR-100, Clothing-1M and Webvision-1.0, demonstrate that our proposed method considerably outperforms state-of-the-art methods.
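The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify how a sample is contrasted with its neighbors, so everything concrete here is an assumption: cosine-similarity k-NN, a Jensen-Shannon divergence between a sample's given label and its neighbors' predicted distributions as the noise score, a fixed threshold, and relabeling by the averaged neighbor predictions. The function names (`knn_indices`, `js_divergence`, `verify_and_correct`) are hypothetical.

```python
import numpy as np

def knn_indices(features, k):
    """Indices of the k nearest neighbors by cosine similarity, excluding self."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence along the last axis (natural log)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=-1)
    kl_qm = np.sum(q * np.log(q / m), axis=-1)
    return 0.5 * kl_pm + 0.5 * kl_qm

def verify_and_correct(features, probs, noisy_labels, k=2, threshold=0.3):
    """Assumed two-step procedure: neighborhood-based verification, then relabeling."""
    num_classes = probs.shape[1]
    onehot = np.eye(num_classes)[noisy_labels]
    nbrs = knn_indices(features, k)          # shape (n, k)
    nbr_probs = probs[nbrs]                  # shape (n, k, num_classes)

    # Step 1 (verification): mean disagreement between a sample's given label
    # and the predictions of its feature-space neighbors.
    score = js_divergence(onehot[:, None, :], nbr_probs).mean(axis=1)
    is_clean = score < threshold

    # Step 2 (correction): relabel noisy samples with the class favored by
    # the averaged neighbor predictions; clean samples keep their labels.
    nbr_vote = nbr_probs.mean(axis=1).argmax(axis=1)
    corrected = np.where(is_clean, noisy_labels, nbr_vote)
    return is_clean, corrected

# Toy data: two tight clusters; the third sample carries a wrong label.
features = np.array([[1.0, 0.0], [1.0, 0.1], [1.0, -0.1],
                     [0.0, 1.0], [0.1, 1.0], [-0.1, 1.0]])
probs = np.array([[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3)
noisy_labels = np.array([0, 0, 1, 1, 1, 1])

is_clean, corrected = verify_and_correct(features, probs, noisy_labels)
print(is_clean, corrected)
```

With these toy inputs, only the third sample is flagged as noisy, and it is relabeled to class 0, the class its neighbors agree on. Because the decision depends on the neighbors rather than on the sample's own prediction, a sample that the model has confidently memorized with a wrong label can still be caught, which is the confirmation-bias concern the abstract raises.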
