CV | Jan 30, 2024

Reviving Undersampling for Long-Tailed Learning

arXiv:2401.16811v1 | 8 citations | h-index: 3 | Has Code | Pattern Recognition
Originality: Incremental advance
AI Analysis

This work addresses a problem relevant to machine learning practitioners: evaluations in long-tailed learning tend to overlook the worst-performing categories. The contribution is incremental, since it revives an existing idea with modifications.

The paper tackles the unbalanced training datasets of long-tailed recognition by reviving balanced undersampling to improve accuracy on the worst-performing categories. A simple ensemble strategy then achieves higher harmonic and geometric mean accuracy while keeping average accuracy close to that of state-of-the-art methods.

The training datasets used in long-tailed recognition are extremely unbalanced, resulting in significant variation in per-class accuracy across categories. Prior works mostly used average accuracy to evaluate their algorithms, which can easily hide the worst-performing categories. In this paper, we aim to enhance the accuracy of the worst-performing categories and utilize the harmonic mean and geometric mean to assess the model's performance. We revive the balanced undersampling idea to achieve this goal. Because the rarest classes have only a few examples, balanced subsets are few-shot and will surely under-fit, which is why undersampling is not used in modern long-tailed learning. However, we find that it produces a more equitable distribution of accuracy across categories, with much higher harmonic and geometric mean accuracy but lower average accuracy. Moreover, we devise a straightforward model ensemble strategy, which does not result in any additional overhead and achieves improved harmonic and geometric mean accuracy while keeping the average accuracy almost intact when compared to state-of-the-art long-tailed learning methods. We validate the effectiveness of our approach on widely utilized benchmark datasets for long-tailed learning. Our code is at https://github.com/yuhao318/BTM/.
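To make the evaluation argument concrete, the following is a minimal sketch of the two ideas the abstract names: summarizing per-class accuracy with the harmonic and geometric mean, and drawing a class-balanced subset by undersampling. It is not taken from the authors' repository. The function names, the `eps` floor for zero-accuracy classes, and the choice to cut every class down to the size of the rarest one are all assumptions made for illustration, and the ensemble strategy is not sketched here.

```python
import math
import random
from collections import defaultdict


def per_class_accuracy(labels, preds):
    """Return {class: accuracy}, computed over the samples of each class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        correct[y] += int(y == p)
    return {c: correct[c] / total[c] for c in total}


def summarize(acc, eps=1e-8):
    """Arithmetic, harmonic and geometric mean of per-class accuracies.

    eps is an assumption of this sketch: it floors zero accuracies so the
    harmonic and geometric means stay defined.
    """
    vals = [max(a, eps) for a in acc.values()]
    n = len(vals)
    return {
        "arithmetic": sum(acc.values()) / n,
        "harmonic": n / sum(1.0 / v for v in vals),
        "geometric": math.exp(sum(math.log(v) for v in vals) / n),
    }


def balanced_undersample(labels, seed=0):
    """Return sample indices with the same count for every class.

    Assumption: the per-class count is the size of the rarest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    k = min(len(idx) for idx in by_class.values())
    subset = []
    for c in sorted(by_class):
        subset.extend(rng.sample(by_class[c], k))
    return subset


if __name__ == "__main__":
    # Two classes at 90% accuracy and one at 10%: the arithmetic mean
    # stays fairly high, while the other two means drop sharply.
    print(summarize({0: 0.9, 1: 0.9, 2: 0.1}))

    # A toy long-tailed label list: 6, 3 and 2 samples per class.
    toy_labels = [0] * 6 + [1] * 3 + [2] * 2
    print(balanced_undersample(toy_labels))
```

Both the harmonic and the geometric mean are pulled down by a single weak class far more than the arithmetic mean is, which is why they are suited to exposing the worst-performing categories. The undersampling helper also shows the cost the abstract mentions: the balanced subset can be no larger than the number of classes times the size of the rarest class.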

Code Implementations: 2 repos
