CV · Mar 7

ACD-U: Asymmetric co-teaching with machine unlearning for robust learning with noisy labels

arXiv:2603.07166v1 · Has Code
Predicted impact: top 88% in CV · last 90 days · Originality: Highly original
AI Analysis

This work provides a robust learning framework for deep neural networks that actively corrects sample-selection errors instead of only avoiding them, which matters to practitioners whose real-world datasets often contain noisy labels.

Deep neural networks struggle with noisy labels, leading to poor generalization. This paper introduces ACD-U, an asymmetric co-teaching framework that combines a CLIP-pretrained vision Transformer with a CNN and incorporates machine unlearning. ACD-U achieves state-of-the-art performance on various noisy datasets, including CIFAR-10/100, CIFAR-N, WebVision, Clothing1M, and Red Mini-ImageNet, especially in high-noise and instance-dependent noise scenarios.

Deep neural networks are prone to memorizing incorrect labels during training, which degrades their generalizability. Although recent methods have combined sample selection with semi-supervised learning (SSL) to exploit the memorization effect (where networks learn from clean data before noisy data), they cannot correct selection errors once a sample is misclassified. To overcome this, we propose asymmetric co-teaching with different architectures (ACD)-U, an asymmetric co-teaching framework that uses different model architectures and incorporates machine unlearning. ACD-U addresses this limitation through two core mechanisms. First, its asymmetric co-teaching pairs a contrastive language-image pretraining (CLIP)-pretrained vision Transformer with a convolutional neural network (CNN), leveraging their complementary learning behaviors: the pretrained model provides stable predictions, whereas the CNN adapts throughout training. This asymmetry, where the vision Transformer is trained only on clean samples and the CNN is trained through SSL, effectively mitigates confirmation bias. Second, selective unlearning enables post-hoc error correction by identifying incorrectly memorized samples through loss trajectory analysis and CLIP consistency checks, and then removing their influence via Kullback-Leibler divergence-based forgetting. This approach shifts the learning paradigm from passive error avoidance to active error correction. Experiments on synthetic and real-world noisy datasets, including CIFAR-10/100, CIFAR-N, WebVision, Clothing1M, and Red Mini-ImageNet, demonstrate state-of-the-art performance, particularly in high-noise regimes and under instance-dependent noise. The code is publicly available at https://github.com/meruemon/ACD-U.
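The selective unlearning step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the selection rule or the exact form of the forgetting objective, so everything below is an assumption made for illustration. The function names (`flag_for_unlearning`, `forgetting_loss`), the `drop_ratio` threshold, the early-versus-late loss comparison, and the choice of a uniform target distribution are all hypothetical. The sketch flags a sample when its loss has fallen sharply over training (suggesting memorization) while a CLIP prediction disagrees with the given label, and it scores "forgetting" as the KL divergence from a uniform distribution to the model's prediction.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]


def forgetting_loss(logits):
    """KL(uniform || p), where p = softmax(logits).

    Assumption: forgetting is modeled as pushing the prediction toward
    the uniform distribution. The value is 0 when p is uniform and grows
    as the model becomes more confident in any class.
    """
    log_p = log_softmax(logits)
    k = len(log_p)
    # KL(u || p) = sum_i (1/k) * (log(1/k) - log p_i)
    return -math.log(k) - sum(log_p) / k


def flag_for_unlearning(loss_history, given_label, clip_label, drop_ratio=0.5):
    """Hypothetical selection rule, not taken from the paper.

    Flags a sample when (a) its mean loss in the later half of training is
    below drop_ratio times its mean loss in the earlier half, and (b) the
    CLIP-predicted label disagrees with the given (possibly noisy) label.
    """
    if len(loss_history) < 2:
        return False
    half = len(loss_history) // 2
    early = sum(loss_history[:half]) / half
    late = sum(loss_history[half:]) / (len(loss_history) - half)
    memorized = late < drop_ratio * early
    return memorized and clip_label != given_label


# Usage: a sample whose loss collapsed although CLIP disagrees with its label.
if flag_for_unlearning([2.0, 1.8, 0.3, 0.1], given_label=3, clip_label=7):
    penalty = forgetting_loss([5.0, 0.0, 0.0])  # confident prediction, positive loss
```

In a real training loop the forgetting term would be minimized by gradient descent on the flagged samples only; the repository linked above is the reference for what ACD-U actually does.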

