LG, CV · Nov 9, 2021

Label-Aware Distribution Calibration for Long-tailed Classification

Chaozheng Wang, Shuzheng Gao, Cuiyun Gao, Pengyun Wang, Wenjie Pei, Lujia Pan, Zenglin Xu

arXiv:2111.04901v112.528 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of imbalanced data in machine learning, in particular improving performance on tail classes. It is an incremental advance, building on prior re-sampling and synthesis techniques.

The paper tackles the problem of long-tailed classification by proposing a label-aware distribution calibration method that transfers statistics from head classes to tail classes, achieving significant performance improvements over existing methods on both image and text datasets.

Real-world data usually present long-tailed distributions. Training on imbalanced data tends to make neural networks perform well on head classes but much worse on tail classes. The main challenge is the severe sparseness of training instances for the tail classes, which results in biased distribution estimation during training. Considerable effort has been devoted to mitigating this challenge, including data re-sampling and synthesizing new training instances for tail classes. However, no prior research has exploited the transferable knowledge from head classes to tail classes for calibrating the distribution of tail classes. In this paper, we hypothesize that tail classes can be enriched by similar head classes and propose a novel distribution calibration approach named Label-Aware Distribution Calibration (LADC). LADC transfers the statistics from relevant head classes to infer the distribution of tail classes. Sampling from the calibrated distribution further facilitates re-balancing the classifier. Experiments on both image and text long-tailed datasets demonstrate that LADC significantly outperforms existing methods. The visualization also shows that LADC provides a more accurate distribution estimation.
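
The idea of transferring head-class statistics to a tail class and then sampling from the calibrated distribution can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not say how "relevant" head classes are chosen or how statistics are combined, so the sketch assumes Gaussian class features, picks the k nearest head classes by Euclidean distance between class means, and averages their statistics. The names `calibrate_tail_class`, `sample_calibrated`, `k`, and `alpha` are hypothetical.

```python
import numpy as np

def calibrate_tail_class(tail_feats, head_means, head_covs, k=2, alpha=0.1):
    """Estimate a Gaussian for a tail class by borrowing head-class statistics.

    Assumption (not from the paper): relevance is the Euclidean distance
    between the tail-class mean and each head-class mean.
    """
    tail_mean = tail_feats.mean(axis=0)
    dists = np.linalg.norm(head_means - tail_mean, axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k closest head classes
    # Calibrated mean: average of the tail mean and the selected head means.
    mean = (head_means[nearest].sum(axis=0) + tail_mean) / (k + 1)
    # Calibrated covariance: average selected head covariance plus a small
    # ridge term (alpha) to keep the matrix well conditioned.
    dim = tail_feats.shape[1]
    cov = head_covs[nearest].mean(axis=0) + alpha * np.eye(dim)
    return mean, cov, nearest

def sample_calibrated(mean, cov, n, seed=0):
    """Draw n synthetic tail-class features from the calibrated Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n)

# Toy data: three head classes with known statistics, one tail class
# with only two training instances.
head_means = np.array([[0.0, 0.0], [5.0, 5.0], [0.5, 0.5]])
head_covs = np.stack([np.eye(2), 2.0 * np.eye(2), 0.5 * np.eye(2)])
tail_feats = np.array([[0.1, 0.2], [0.3, 0.0]])

mean, cov, nearest = calibrate_tail_class(tail_feats, head_means, head_covs, k=2)
synthetic = sample_calibrated(mean, cov, n=100)
```

The synthetic features could then be mixed with the real tail-class instances when re-training the classifier, which is one way to read the re-balancing step the abstract mentions.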
