LG · HC · Jul 23, 2021

An Instance-Dependent Simulation Framework for Learning with Label Noise

arXiv:2107.11413v4 · 29 citations
Originality: Incremental advance
AI Analysis

This work addresses label noise, a common issue in real-world machine learning datasets. The advance is incremental because it builds on existing noisy-label techniques.

The authors address learning with instance-dependent label noise by proposing a simulation framework whose synthetic noisy labels are closer to human labels than those produced by existing methods. They also introduce a Label Quality Model (LQM) that improves model performance when combined with existing techniques.

We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm. We show that the distribution of the synthetic noisy labels generated with our framework is closer to human labels compared to independent and class-conditional random flipping. Equipped with controllable label noise, we study the negative impact of noisy labels across a few practical settings to understand when label noise is more problematic. We also benchmark several existing algorithms for learning with noisy labels and compare their behavior on our synthetic datasets and on the datasets with independent random label noise. Additionally, with the availability of annotator information from our simulation framework, we propose a new technique, Label Quality Model (LQM), that leverages annotator features to predict and correct against noisy labels. We show that by adding LQM as a label correction step before applying existing noisy label techniques, we can further improve the models' performance.
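The idea of producing instance-dependent noise through pseudo-labeling can be illustrated with a small sketch. This is not the authors' implementation: the nearest-centroid "rater", the toy Gaussian data, and every name below (`make_blobs`, `rater_probs`, `sample_labels`, `simulate`) are assumptions chosen for illustration. A deliberately weak rater model is fit on a handful of examples, and each noisy label is then sampled from that rater's predicted class distribution, so the chance of a wrong label depends on the instance's features rather than only on its class.

```python
import numpy as np


def make_blobs(n_per_class, centers, rng):
    """Toy dataset (an assumption): one unit-variance Gaussian blob per class."""
    X = np.concatenate(
        [rng.normal(c, 1.0, size=(n_per_class, len(c))) for c in centers]
    )
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


def rater_probs(X, X_small, y_small, n_classes, temperature=1.0):
    """Hypothetical 'rater': nearest-centroid model fit on a tiny subset.

    Returns a softmax over negative squared distances to the class centroids.
    """
    centroids = np.stack(
        [X_small[y_small == k].mean(axis=0) for k in range(n_classes)]
    )
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def sample_labels(probs, rng):
    """Draw one label per row from its categorical distribution."""
    cum = probs.cumsum(axis=1)
    u = rng.random((probs.shape[0], 1))
    idx = (u > cum).sum(axis=1)
    return np.minimum(idx, probs.shape[1] - 1)  # guard against rounding


def simulate(seed=0, n_per_class=200, n_rater_examples=5):
    rng = np.random.default_rng(seed)
    centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    X, y = make_blobs(n_per_class, centers, rng)
    # The rater only sees a few clean examples per class, so it is imperfect.
    idx = np.concatenate(
        [
            rng.choice(np.flatnonzero(y == k), n_rater_examples, replace=False)
            for k in range(len(centers))
        ]
    )
    probs = rater_probs(X, X[idx], y[idx], n_classes=len(centers))
    noisy = sample_labels(probs, rng)
    return X, y, probs, noisy


if __name__ == "__main__":
    X, y, probs, noisy = simulate(seed=0)
    print("overall noise rate:", float((noisy != y).mean()))
```

In this sketch, points that lie near a boundary between centroids receive flatter rater distributions and are therefore more likely to be mislabeled, which is the property that separates instance-dependent noise from independent or class-conditional random flipping.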

Code Implementations: 1 repo
