LG CV AP ML | Mar 13, 2021

Learning with Feature-Dependent Label Noise: A Progressive Approach

Yikai Zhang, Songzhu Zheng, Pengxiang Wu, Mayank Goswami, Chao Chen

arXiv:2103.07756v3

Originality: Highly original

AI Analysis

This work addresses a critical issue for real-world machine learning applications, where label noise is heterogeneous and feature-dependent, and it offers a novel method for this known bottleneck.

The paper tackles feature-dependent label noise in large-scale datasets by proposing a progressive label correction algorithm. The authors prove that a classifier trained with this strategy converges to be consistent with the Bayes classifier, and they report that it outperforms state-of-the-art baselines in experiments.

Label noise is frequently observed in real-world large-scale datasets. The noise is introduced due to a variety of reasons; it is heterogeneous and feature-dependent. Most existing approaches to handling noisy labels fall into two categories: they either assume an ideal feature-independent noise, or remain heuristic without theoretical guarantees. In this paper, we propose to target a new family of feature-dependent label noise, which is much more general than commonly used i.i.d. label noise and encompasses a broad spectrum of noise patterns. Focusing on this general noise family, we propose a progressive label correction algorithm that iteratively corrects labels and refines the model. We provide theoretical guarantees showing that for a wide variety of (unknown) noise patterns, a classifier trained with this strategy converges to be consistent with the Bayes classifier. In experiments, our method outperforms SOTA baselines and is robust to various noise types and levels.
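The idea of iteratively correcting labels and refining the model can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D logistic model, the synthetic data, the function names, and the threshold schedule (`theta0`, `step`, `theta_min`) are all assumptions chosen for illustration. The sketch retrains on the current labels, flips a label only when the model disagrees with it confidently, and lowers the confidence threshold once no more labels change.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, epochs=100, lr=1.0):
    """Fit p(y=1|x) = sigmoid(w*x + b) by full-batch gradient descent."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def progressive_label_correction(xs, noisy_ys, theta0=0.4, step=0.1,
                                 theta_min=0.1, max_rounds=20):
    """Hypothetical sketch: alternate between training and label flipping.

    A label is flipped only if the model predicts the other class with
    confidence |p - 0.5| >= theta. When a round flips nothing, theta is
    lowered so that less confident regions are corrected next.
    """
    ys = list(noisy_ys)
    theta = theta0
    for _ in range(max_rounds):
        w, b = train_logistic(xs, ys)
        flipped = 0
        for i, x in enumerate(xs):
            p = sigmoid(w * x + b)
            pred = 1 if p >= 0.5 else 0
            if pred != ys[i] and abs(p - 0.5) >= theta:
                ys[i] = pred
                flipped += 1
        if flipped == 0:
            if theta <= theta_min:
                break
            theta = max(theta_min, theta - step)
    return ys


def make_noisy_data(n=300, seed=0):
    """Synthetic data with feature-dependent noise (an assumption):
    labels near the decision boundary x = 0 are flipped more often."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    clean = [1 if x > 0 else 0 for x in xs]
    noisy = []
    for x, y in zip(xs, clean):
        flip_prob = 0.3 if abs(x) < 1.0 else 0.1
        noisy.append(1 - y if rng.random() < flip_prob else y)
    return xs, clean, noisy


def accuracy(labels, reference):
    return sum(a == b for a, b in zip(labels, reference)) / len(labels)


xs, clean, noisy = make_noisy_data()
corrected = progressive_label_correction(xs, noisy)
print("label accuracy before:", accuracy(noisy, clean))
print("label accuracy after: ", accuracy(corrected, clean))
```

Starting with a high threshold means early rounds only touch labels the model is very sure about, which limits the risk of reinforcing its own mistakes; the threshold is relaxed only after those easy corrections are exhausted.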
