Learning Debiased Classifier with Biased Committee
This work addresses classifier bias, a practical concern for machine learning practitioners, and offers an incremental improvement over existing debiasing techniques without requiring spurious attribute labels.
The paper tackles the tendency of neural networks to rely on spurious correlations in training data, which harms generalization. It proposes a committee of biased classifiers that identifies bias-conflicting data and assigns them larger weights when training a debiased main classifier. On five real-world datasets, the method outperforms prior methods that likewise use no spurious attribute labels.
Neural networks are prone to being biased towards spurious correlations between classes and latent attributes exhibited in a major portion of training data, which ruins their generalization capability. We propose a new method for training debiased classifiers without spurious attribute labels. The key idea is to employ a committee of classifiers as an auxiliary module that identifies bias-conflicting data, i.e., data without spurious correlation, and assigns large weights to them when training the main classifier. The committee is learned as a bootstrapped ensemble so that a majority of its classifiers are both biased and diverse, and thus intentionally fail to predict the classes of bias-conflicting data. The consensus within the committee on prediction difficulty thus provides a reliable cue for identifying and weighting bias-conflicting data. Moreover, the committee is also trained with knowledge transferred from the main classifier so that it gradually becomes debiased along with the main classifier and emphasizes more difficult data as training progresses. On five real-world datasets, our method outperforms prior methods that, like ours, use no spurious attribute labels, and occasionally even surpasses those relying on bias labels.
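The consensus-based weighting described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the smoothing constant `alpha`, and the weight formula (the inverse of the fraction of committee members that classify a sample correctly) are assumptions chosen for illustration. The committee predictions are hard-coded toy values rather than outputs of trained networks, and knowledge transfer from the main classifier is omitted.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement: one bootstrap replicate of the
    training set, intended for a single committee member."""
    return [rng.randrange(n) for _ in range(n)]


def consensus_weights(committee_preds, labels, alpha=0.1):
    """Weight each sample by how rarely the committee classifies it correctly.

    committee_preds[k][i] is the class predicted by member k for sample i.
    The formula 1 / (fraction_correct + alpha) is an illustrative assumption:
    samples most members get wrong (likely bias-conflicting) get large weights.
    """
    m = len(committee_preds)
    weights = []
    for i, y in enumerate(labels):
        n_correct = sum(1 for preds in committee_preds if preds[i] == y)
        weights.append(1.0 / (n_correct / m + alpha))
    return weights


def weighted_mean_loss(losses, weights):
    """Weighted average of per-sample losses for the main classifier."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy example: 3 samples, 4 committee members (hypothetical predictions).
labels = [0, 1, 1]
committee_preds = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
weights = consensus_weights(committee_preds, labels)

# Hypothetical per-sample losses of the main classifier.
losses = [0.1, 0.5, 2.0]
loss = weighted_mean_loss(losses, weights)

# One bootstrap replicate per committee member, seeded for reproducibility.
rng = random.Random(0)
replicates = [bootstrap_indices(len(labels), rng) for _ in committee_preds]
```

In this toy setup the third sample, which only one of four members predicts correctly, receives the largest weight, so it contributes most to the main classifier's weighted loss.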