LG · Aug 6, 2021

Unsupervised Learning of Debiased Representations with Pseudo-Attributes

Seonguk Seo, Joon-Young Lee, Bohyung Han

arXiv:2108.02943v2 · 20.462 citations

Originality: Highly original

AI Analysis

This work addresses dataset bias in machine learning, which can degrade model fairness and generalization, and offers an unsupervised alternative to annotation-dependent approaches.

The paper tackles dataset bias with an unsupervised method that identifies pseudo-attributes via clustering and applies cluster-wise reweighting to learn debiased representations, achieving accuracy competitive with supervised methods on standard benchmarks.

Dataset bias is a critical challenge in machine learning because spurious correlations lead a model to capture unintended decision rules, which negatively affects the model. Although existing works often address this issue with human supervision, obtaining proper annotations is impractical or even unrealistic. To overcome this limitation, we propose a simple but effective unsupervised debiasing technique. Specifically, we first identify pseudo-attributes by clustering in the feature embedding space, without any explicit bias-attribute supervision. Then, we employ a novel cluster-wise reweighting scheme to learn debiased representations; the proposed method prevents minority groups from being discounted when minimizing the overall loss, which is desirable for worst-case generalization. Extensive experiments demonstrate the outstanding performance of our approach on multiple standard benchmarks, where it even achieves accuracy competitive with its supervised counterpart.
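The two steps described in the abstract (clustering to obtain pseudo-attributes, then cluster-wise reweighting) can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or weighting formula the authors use, so everything below is an assumption made for illustration: plain k-means with farthest-first initialization stands in for the clustering step, inverse-cluster-size weights stand in for the reweighting scheme, and the names `kmeans`, `cluster_weights`, and `reweighted_loss` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iters=20):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering).

    Returns one cluster index per feature vector; the index plays the
    role of a pseudo-attribute.
    """
    # Farthest-first initialization: deterministic, spreads centers out.
    centers = [list(features[0])]
    while len(centers) < k:
        far = max(features, key=lambda f: min(sq_dist(f, c) for c in centers))
        centers.append(list(far))

    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each sample goes to its nearest center.
        assign = [
            min(range(k), key=lambda j: sq_dist(f, centers[j]))
            for f in features
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [f for f, a in zip(features, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_weights(assign, k):
    """Assumed weighting: inverse cluster size, normalized to mean 1."""
    sizes = [assign.count(j) for j in range(k)]
    raw = [1.0 / sizes[a] for a in assign]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def reweighted_loss(per_sample_losses, weights):
    """Weighted mean of per-sample losses."""
    total = sum(w * l for w, l in zip(weights, per_sample_losses))
    return total / len(weights)


# Toy embedding: 8 samples in a majority group, 2 in a minority group.
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (0.2, 0.0), (0.0, 0.2), (0.2, 0.1), (0.1, 0.2),
    (5.0, 5.0), (5.1, 5.0),
]
# Hypothetical per-sample losses: the minority group is fit poorly.
losses = [0.1] * 8 + [2.0] * 2

assign = kmeans(features, k=2)
weights = cluster_weights(assign, k=2)
plain = sum(losses) / len(losses)
weighted = reweighted_loss(losses, weights)
print("pseudo-attributes:", assign)
print("plain loss:", plain, "reweighted loss:", weighted)
```

In this toy setup the minority cluster receives a larger per-sample weight, so its high loss is no longer averaged away by the majority. In an actual training loop the features would come from the model's embedding layer and the weights would multiply the per-sample training loss; how often to re-cluster and how the weights are computed are design choices the abstract leaves open.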
