LG
Mar 22, 2023

Fairness Improves Learning from Noisily Labeled Long-Tailed Data

arXiv:2303.12291v1
9 citations
h-index: 86
Originality: Incremental advance
AI Analysis

This paper addresses a practical problem for machine learning practitioners working with imbalanced, noisily labeled data. It offers an incremental improvement by integrating fairness considerations into existing solutions.

The paper tackles the combined challenge of learning from datasets that are both long-tailed and noisily labeled, a setting in which prior methods neither improve performance consistently nor benefit all sub-populations. It introduces a Fairness Regularizer (FR) that improves both tail sub-population performance and overall learning when combined with existing robust or class-balanced methods.

Both long-tailed and noisily labeled data frequently appear in real-world applications and impose significant challenges for learning. Most prior works treat either problem in an isolated way and do not explicitly consider the coupling effects of the two. Our empirical observation reveals that such solutions fail to consistently improve the learning when the dataset is long-tailed with label noise. Moreover, with the presence of label noise, existing methods do not observe universal improvements across different sub-populations; in other words, some sub-populations enjoyed the benefits of improved accuracy at the cost of hurting others. Based on these observations, we introduce the Fairness Regularizer (FR), inspired by regularizing the performance gap between any two sub-populations. We show that the introduced fairness regularizer improves the performances of sub-populations on the tail and the overall learning performance. Extensive experiments demonstrate the effectiveness of the proposed solution when complemented with certain existing popular robust or class-balanced methods.
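
The idea of regularizing the performance gap between any two sub-populations can be illustrated with a small sketch. This is not the paper's exact formulation, which the abstract does not spell out. It assumes that sub-populations are the groups defined by the observed (possibly noisy) labels, that "performance" is measured by mean cross-entropy, and that the penalty is the mean absolute gap over all pairs of groups. The function names and the weight `lam` are hypothetical.

```python
import math
from itertools import combinations

def cross_entropy(probs, label):
    """Negative log-likelihood of the labeled class for one sample."""
    return -math.log(max(probs[label], 1e-12))

def per_group_mean_loss(all_probs, labels):
    """Average cross-entropy within each group (assumed: one group per observed label)."""
    sums, counts = {}, {}
    for probs, y in zip(all_probs, labels):
        sums[y] = sums.get(y, 0.0) + cross_entropy(probs, y)
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def pairwise_gap_penalty(group_losses):
    """Mean absolute loss gap over all pairs of groups; 0.0 with fewer than two groups."""
    pairs = list(combinations(sorted(group_losses), 2))
    if not pairs:
        return 0.0
    return sum(abs(group_losses[a] - group_losses[b]) for a, b in pairs) / len(pairs)

def regularized_loss(all_probs, labels, lam=1.0):
    """Mean cross-entropy plus lam times the pairwise gap penalty (illustrative only)."""
    base = sum(cross_entropy(p, y) for p, y in zip(all_probs, labels)) / len(labels)
    gaps = pairwise_gap_penalty(per_group_mean_loss(all_probs, labels))
    return base + lam * gaps

# Toy long-tailed batch: three well-predicted head samples, one poorly predicted tail sample.
probs = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.6, 0.4]]
labels = [0, 0, 0, 1]
plain = regularized_loss(probs, labels, lam=0.0)
with_penalty = regularized_loss(probs, labels, lam=1.0)
```

In this toy batch the plain average is dominated by the head class, so the weak tail prediction contributes little. The penalty term grows with the head/tail loss gap, which is one plausible way for a regularizer of this kind to push a model toward better tail performance. In practice such a term would be added to a robust or class-balanced base loss rather than to plain cross-entropy.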

