Balanced Contrastive Learning for Long-Tailed Visual Recognition
This work addresses the challenge of improving visual recognition for minority classes in long-tailed datasets, a situation that is common in real-world applications. The approach is incremental, however, because it builds on existing supervised contrastive learning methods.
The paper tackles the problem of representation learning for long-tailed visual data, where models struggle with minority classes, by proposing a balanced contrastive learning (BCL) method that corrects optimization issues in supervised contrastive learning and achieves competitive performance on benchmark datasets like CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist2018.
Real-world data typically follow a long-tailed distribution, where a few majority categories occupy most of the data while most minority categories contain a limited number of samples. Classification models minimizing cross-entropy struggle to represent and classify the tail classes. Although the problem of learning unbiased classifiers has been well studied, methods for representing imbalanced data are under-explored. In this paper, we focus on representation learning for imbalanced data. Supervised contrastive learning (SCL) has recently shown promising performance on balanced data. However, through our theoretical analysis, we find that for long-tailed data it fails to form a regular simplex, which is an ideal geometric configuration for representation learning. To correct the optimization behavior of SCL and further improve the performance of long-tailed visual recognition, we propose a novel loss for balanced contrastive learning (BCL). Compared with SCL, BCL makes two improvements: class-averaging, which balances the gradient contribution of negative classes, and class-complement, which allows all classes to appear in every mini-batch. BCL satisfies the condition for forming a regular simplex and assists the optimization of cross-entropy. Equipped with BCL, the proposed two-branch framework can obtain a stronger feature representation and achieve competitive performance on long-tailed benchmark datasets such as CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist2018. Our code is available at https://github.com/FlamieZhu/BCL .
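The two ingredients named in the abstract, class-averaging and class-complement, can be illustrated with a small pure-Python sketch. This is not the authors' implementation (see the repository linked above for that). The function name `class_averaged_contrastive_loss`, the `prototypes` argument used as a stand-in for "every class appears in every mini-batch", the temperature value, and the exact normalisation are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def class_averaged_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """Illustrative contrastive loss with class-averaging and class-complement.

    features   : list of (assumed L2-normalised) feature vectors
    labels     : list of integer class labels, one per feature
    prototypes : dict {class_id: vector}, one hypothetical centre per class;
                 it must contain every class that occurs in `labels`
    """
    # Class-complement (assumed form): append one prototype per class so that
    # every class, including tail classes absent from the batch, is present.
    classes = sorted(prototypes)
    feats = list(features) + [prototypes[c] for c in classes]
    labs = list(labels) + classes

    members = defaultdict(list)
    for idx, c in enumerate(labs):
        members[c].append(idx)

    total = 0.0
    for i in range(len(features)):  # only real samples act as anchors
        # Class-averaging (assumed form): each class contributes the MEAN of
        # its exponentiated similarities, so head classes with many samples
        # do not dominate the denominator.
        denom = 0.0
        for idxs in members.values():
            others = [k for k in idxs if k != i]
            if others:
                sims = [math.exp(dot(feats[i], feats[k]) / temperature) for k in others]
                denom += sum(sims) / len(others)

        # Positives: same-class samples plus the class prototype.
        positives = [p for p in members[labs[i]] if p != i]
        log_probs = [dot(feats[i], feats[p]) / temperature - math.log(denom)
                     for p in positives]
        total += -sum(log_probs) / len(positives)

    return total / len(features)


if __name__ == "__main__":
    # Toy imbalanced batch: three samples of class 0, one of class 1, none of class 2.
    protos = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [-1.0, 0.0]}
    batch = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(class_averaged_contrastive_loss(batch, [0, 0, 0, 1], protos))
```

In this sketch, the prototype of class 2 still enters the denominator even though the batch contains no class-2 sample, and the three class-0 samples together carry the same weight as the single class-1 sample, which is the balancing effect the abstract attributes to the two improvements.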