Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation
This paper addresses bias mitigation in AI to improve fairness and reliability. It offers a hyperparameter-free approach, although the method itself is an incremental advance.
The paper tackles the problem of deep neural networks learning spurious correlations from biased datasets. It proposes a controllable feature whitening framework that removes linear correlations between target and bias features, mitigating bias without regularization terms or adversarial learning, and it reports outperforming existing methods on four benchmark datasets.
As the use of artificial intelligence rapidly increases, developing trustworthy artificial intelligence has become important. However, recent studies have shown that deep neural networks are susceptible to learning spurious correlations present in datasets. To improve reliability, we propose a simple yet effective framework called controllable feature whitening. We quantify the linear correlation between the target and bias features with the covariance matrix and eliminate it through the whitening module. Our results systematically demonstrate that removing the linear correlations between the features fed into the last linear classifier significantly mitigates the bias, while avoiding the need to model intractable higher-order dependencies. A particular advantage of the proposed method is that it does not require regularization terms or adversarial learning, which often lead to unstable optimization in practice. Furthermore, we show that two fairness criteria, demographic parity and equalized odds, can be effectively handled by whitening with the re-weighted covariance matrix. Consequently, our method controls the trade-off between the utility and fairness of algorithms by adjusting the weighting coefficient. Finally, we validate that our method outperforms existing approaches on four benchmark datasets: Corrupted CIFAR-10, Biased FFHQ, WaterBirds, and Celeb-A.
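To make the core idea concrete, the following is a minimal NumPy sketch of whitening applied to concatenated target and bias features, so that their cross-covariance vanishes. It is an illustration under stated assumptions, not the paper's whitening module: it uses generic ZCA whitening on fixed arrays, and the function name `whiten_joint_features`, the `sample_weight` argument (a hypothetical stand-in for a re-weighted covariance), and the `eps` stabilizer are all introduced here for illustration.

```python
import numpy as np


def whiten_joint_features(z_target, z_bias, sample_weight=None, eps=1e-5):
    """ZCA-whiten the concatenation [z_target, z_bias].

    Illustrative sketch only. After whitening, the (weighted) covariance of
    the joint features is approximately the identity, so the linear
    correlation between the target block and the bias block is removed.
    """
    z = np.concatenate([z_target, z_bias], axis=1)
    n = z.shape[0]

    # Assumption: per-sample weights stand in for "re-weighting" the
    # covariance; uniform weights give the ordinary covariance.
    if sample_weight is None:
        w = np.full(n, 1.0 / n)
    else:
        w = np.asarray(sample_weight, dtype=float)
        w = w / w.sum()

    mean = w @ z                      # weighted mean of the joint features
    zc = z - mean                     # centered features
    cov = (zc * w[:, None]).T @ zc    # weighted covariance matrix
    cov += eps * np.eye(cov.shape[0])  # small ridge for numerical stability

    # Inverse matrix square root via eigendecomposition (ZCA whitening).
    eigval, eigvec = np.linalg.eigh(cov)
    whitener = (eigvec * eigval ** -0.5) @ eigvec.T

    zw = zc @ whitener
    d_t = z_target.shape[1]
    return zw[:, :d_t], zw[:, d_t:]


# Toy data: bias features are linearly correlated with target features.
rng = np.random.default_rng(0)
n_samples = 5000
z_t = rng.normal(size=(n_samples, 3))
z_b = 0.8 * z_t[:, :2] + 0.6 * rng.normal(size=(n_samples, 2))

t_w, b_w = whiten_joint_features(z_t, z_b)
cross_before = (z_t - z_t.mean(0)).T @ (z_b - z_b.mean(0)) / n_samples
cross_after = t_w.T @ b_w / n_samples
print("max |cross-cov| before:", np.abs(cross_before).max())
print("max |cross-cov| after: ", np.abs(cross_after).max())
```

In this sketch, passing non-uniform `sample_weight` values changes which covariance is whitened, which is one plausible way to picture how a re-weighted covariance could shift the balance between utility and fairness; how the paper actually constructs its re-weighted covariance and weighting coefficient is not specified in the abstract.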