CV, LG | Dec 29, 2021

Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification

Beier Zhu, Yulei Niu, Xian-Sheng Hua, Hanwang Zhang

arXiv:2112.14380v1 | 14.050 citations | Has Code

Originality: Highly original

AI Analysis

This paper addresses the overlooked issue of unbiasedness in long-tailed classification: in real-world scenarios the test data may itself be imbalanced, and the paper offers a method that performs well on both balanced and imbalanced test distributions.
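To make the dependence on the test distribution concrete, here is a minimal sketch in Python. It relies only on the fact that overall accuracy is the class-prior-weighted average of per-class accuracies. The per-class accuracies, the four-class setup, and the function names are hypothetical illustrations, not results from the paper.

```python
def zipf_prior(num_classes, exponent=1.0):
    """Class prior proportional to 1 / rank**exponent (rank 1 = head class)."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_classes + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def overall_accuracy(per_class_acc, class_prior):
    """Overall accuracy = sum over classes of P(class) * accuracy on that class."""
    return sum(acc * p for acc, p in zip(per_class_acc, class_prior))


# Hypothetical per-class accuracies, ordered from head class to tail class.
head_biased = [0.90, 0.70, 0.50, 0.30]   # favors frequent classes
tail_biased = [0.70, 0.65, 0.65, 0.60]   # trades head accuracy for tail accuracy

balanced_test = [0.25, 0.25, 0.25, 0.25]
long_tailed_test = zipf_prior(4)

for name, prior in [("balanced", balanced_test), ("long-tailed", long_tailed_test)]:
    print(name,
          "head-biased:", round(overall_accuracy(head_biased, prior), 3),
          "tail-biased:", round(overall_accuracy(tail_biased, prior), 3))
```

With these made-up numbers, the tail-biased model scores higher on the balanced test set and the head-biased model scores higher on the Zipf-distributed one, which is the ranking reversal the summary describes.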

The paper tackles the problem of biased performance in long-tailed classification when the test distribution is imbalanced. It proposes Cross-Domain Empirical Risk Minimization (xERM), which aims for unbiased, strong performance on both balanced and imbalanced test sets by learning better feature representations.

We address the overlooked issue of unbiasedness in existing long-tailed classification methods. We find that their overall improvement is mostly attributable to a biased preference for the tail over the head, because the test distribution is assumed to be balanced. However, when the test set is as imbalanced as the long-tailed training data (that is, when the test distribution follows Zipf's law), the tail bias is no longer beneficial overall because it hurts the head majority. In this paper, we propose Cross-Domain Empirical Risk Minimization (xERM) for training an unbiased model that achieves strong performance on both test distributions. Empirically, we show that xERM fundamentally improves classification by learning a better feature representation rather than by playing the head vs. tail game. Based on causality, we further explain theoretically why xERM achieves unbiasedness: the bias caused by domain selection is removed by adjusting the empirical risks on the imbalanced domain and on the balanced but unseen domain. Code is available at https://github.com/BeierZhu/xERM.
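The abstract does not give the xERM objective itself, so the following is only a rough sketch of what "adjusting the empirical risks on the imbalanced domain and the balanced but unseen domain" could look like for a single sample. It assumes, hypothetically, that the imbalanced-domain risk is cross-entropy against the ground-truth label, that the unseen balanced domain is approximated by soft labels from some balanced-trained model, and that the two risks are mixed with a fixed weight. The function names and the weighting scheme are illustrative and are not taken from the paper or its repository.

```python
import math


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def two_domain_risk(predicted_probs, one_hot_label, balanced_soft_label,
                    w_imbalanced=0.5):
    """Convex combination of two per-sample risks (hypothetical weighting).

    w_imbalanced weights the risk on the observed, imbalanced domain;
    (1 - w_imbalanced) weights the risk on the assumed balanced domain.
    """
    if not 0.0 <= w_imbalanced <= 1.0:
        raise ValueError("w_imbalanced must lie in [0, 1]")
    risk_imbalanced = cross_entropy(one_hot_label, predicted_probs)
    risk_balanced = cross_entropy(balanced_soft_label, predicted_probs)
    return w_imbalanced * risk_imbalanced + (1.0 - w_imbalanced) * risk_balanced


# Illustrative three-class example with made-up probabilities.
prediction = [0.6, 0.3, 0.1]
label = [1.0, 0.0, 0.0]              # ground truth from the long-tailed training set
balanced_teacher = [0.4, 0.4, 0.2]   # hypothetical soft label from a balanced model
print(two_domain_risk(prediction, label, balanced_teacher, w_imbalanced=0.5))
```

Setting `w_imbalanced` to 1 recovers plain empirical risk minimization on the long-tailed data, and setting it to 0 trains purely against the balanced-domain targets. How the weights should actually be chosen is left to the paper and the linked code.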

