CVMay 29, 2022

Revisiting the Importance of Amplifying Bias for Debiasing

Jungsoo Lee, Jeonghoon Park, Daeyoung Kim, Juyoung Lee, Edward Choi, Jaegul Choo

arXiv:2205.14594v414.530 citationsh-index: 44

Originality Incremental advance

AI Analysis

This work addresses dataset bias in image classification, which can lead to unfair or inaccurate models, but it is incremental as it builds on existing debiasing frameworks.

The paper tackles the problem of debiasing in image classification by focusing on improving the biased model component, revealing that removing bias-conflicting samples from its training set enhances debiasing performance. The proposed data sample selection method boosts existing approaches, achieving state-of-the-art results on synthetic and real-world datasets.

In image classification, "debiasing" aims to train a classifier to be less susceptible to dataset bias, the strong correlation between peripheral attributes of data samples and a target class. For example, even if the frog class in the dataset mainly consists of frog images with a swamp background (i.e., bias-aligned samples), a debiased classifier should be able to correctly classify a frog at a beach (i.e., bias-conflicting samples). Recent debiasing approaches commonly use two components for debiasing, a biased model $f_B$ and a debiased model $f_D$. $f_B$ is trained to focus on bias-aligned samples (i.e., overfitted to the bias) while $f_D$ is mainly trained with bias-conflicting samples by concentrating on samples which $f_B$ fails to learn, leading $f_D$ to be less susceptible to the dataset bias. While the state-of-the-art debiasing techniques have aimed to better train $f_D$, we focus on training $f_B$, an overlooked component until now. Our empirical analysis reveals that removing the bias-conflicting samples from the training set for $f_B$ is important for improving the debiasing performance of $f_D$. This is due to the fact that the bias-conflicting samples work as noisy samples for amplifying the bias for $f_B$ since those samples do not include the bias attribute. To this end, we propose a simple yet effective data sample selection method which removes the bias-conflicting samples to construct a bias-amplified dataset for training $f_B$. Our data sample selection method can be directly applied to existing reweighting-based debiasing approaches, obtaining consistent performance boost and achieving the state-of-the-art performance on both synthetic and real-world datasets.

View on arXiv PDF

Similar