LG | AI | HC | ML | May 28, 2021

Rethinking Noisy Label Models: Labeler-Dependent Noise with Adversarial Awareness

arXiv:2105.14083v2 | 4 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of learning from noisy labels, particularly in crowdsourced datasets, by introducing a more realistic labeler-dependent noise model and a label-filtering framework that remains robust under adversarial label attacks.

The authors propose a labeler-dependent noise model that accounts for both natural errors and adversarial attacks. They show that existing state-of-the-art methods fail under this model, and they present a labeler-aware framework that filters noisy labels effectively.

Most studies on learning from noisy labels rely on unrealistic models of i.i.d. label noise, such as class-conditional transition matrices. More recent work on instance-dependent noise models is more realistic, but assumes a single generative process for label noise across the entire dataset. We propose a more principled model of label noise that generalizes instance-dependent noise to multiple labelers, based on the observation that modern datasets are typically annotated using distributed crowdsourcing methods. Under our labeler-dependent model, label noise manifests itself under two modalities: natural error of good-faith labelers, and adversarial labels provided by malicious actors. We present two adversarial attack vectors that more accurately reflect the label noise that may be encountered in real-world settings, and demonstrate that under our multimodal noisy labels model, state-of-the-art approaches for learning from noisy labels are defeated by adversarial label attacks. Finally, we propose a multi-stage, labeler-aware, model-agnostic framework that reliably filters noisy labels by leveraging knowledge about which data partitions were labeled by which labeler, and show that our proposed framework remains robust even in the presence of extreme adversarial label noise.
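To make the labeler-dependent idea concrete, here is a minimal sketch in Python. It is not the authors' framework or attack vectors. It only illustrates (a) label noise whose rate depends on which labeler produced each label, and (b) a simple filter that discards whole labeler partitions whose labels rarely agree with a reference prediction. The function names, the per-labeler flip rates, the `min_agreement` threshold, and the use of the true labels as a stand-in for a reference model's predictions are all assumptions made for illustration.

```python
import numpy as np


def simulate_labeler_noise(y_true, labeler_ids, flip_rates, num_classes, rng):
    """Flip each label with the probability assigned to its labeler (hypothetical noise model)."""
    y_noisy = y_true.copy()
    rates = np.array([flip_rates[int(l)] for l in labeler_ids])
    flip = rng.random(len(y_true)) < rates
    # A non-zero offset modulo num_classes guarantees a flipped label changes class.
    offsets = rng.integers(1, num_classes, size=len(y_true))
    y_noisy[flip] = (y_true[flip] + offsets[flip]) % num_classes
    return y_noisy


def filter_by_labeler(y_noisy, y_pred, labeler_ids, min_agreement=0.5):
    """Keep samples only from labelers whose labels mostly agree with y_pred."""
    keep = np.zeros(len(y_noisy), dtype=bool)
    agreement = {}
    for l in np.unique(labeler_ids):
        mask = labeler_ids == l
        agreement[int(l)] = float(np.mean(y_noisy[mask] == y_pred[mask]))
        if agreement[int(l)] >= min_agreement:
            keep |= mask
    return keep, agreement


rng = np.random.default_rng(0)
num_classes = 10
y_true = rng.integers(0, num_classes, size=3000)
labeler_ids = np.repeat([0, 1, 2], 1000)   # three labelers, 1000 samples each
flip_rates = {0: 0.1, 1: 0.1, 2: 1.0}      # labeler 2 plays the adversary

y_noisy = simulate_labeler_noise(y_true, labeler_ids, flip_rates, num_classes, rng)
# Assumption: y_true stands in for the predictions of some reference model.
keep, agreement = filter_by_labeler(y_noisy, y_true, labeler_ids)
print(agreement)
print("samples kept:", int(keep.sum()))
```

In this toy setup the adversarial labeler's partition is dropped as a unit, which is the kind of decision that per-sample filtering without labeler identities cannot make. A real pipeline would need a reference model trained without access to clean labels.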
