CL, LG | Oct 14, 2022

Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding

Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang

arXiv:2210.07547v124.1295 citations | h-index: 70 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses dataset bias for NLP practitioners by offering a more efficient alternative to two-stage methods, though the advance is incremental because it builds on existing representation normalization techniques.

The paper tackles dataset bias in fine-tuned models by proposing Kernel-Whitening, an end-to-end method that uses isotropic sentence embedding to eliminate bias, significantly improving BERT's performance on out-of-distribution datasets while maintaining in-distribution accuracy.

Dataset bias has attracted increasing attention recently for its detrimental effect on the generalization ability of fine-tuned models. The current mainstream solution is to design an additional shallow model to pre-identify biased instances. However, such two-stage methods increase the computational complexity of the training process and obstruct valid feature information while mitigating bias. To address this issue, we use a representation normalization method that aims to disentangle the correlations between features of encoded sentences. We find that it is also promising for eliminating bias because it provides an isotropic data distribution. We further propose Kernel-Whitening, a Nyström kernel approximation method that achieves more thorough debiasing of nonlinear spurious correlations. Our framework is end-to-end, with time consumption similar to fine-tuning. Experiments show that Kernel-Whitening significantly improves the performance of BERT on out-of-distribution datasets while maintaining in-distribution accuracy.
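The two ingredients named in the abstract, whitening and a Nyström kernel approximation, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the RBF kernel, the random landmark selection, the `gamma` and `eps` values, the toy data, and the function names `whiten`, `rbf_kernel`, and `nystrom_features` are all assumptions made for illustration. The sketch maps embeddings into an approximate kernel feature space and then whitens those features so that their covariance is close to isotropic.

```python
import numpy as np

def whiten(x, eps=1e-6):
    """Center x of shape (n, d) and rescale it so its covariance is close to identity."""
    centered = x - x.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / x.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # ZCA-style transform U diag(1 / sqrt(lambda + eps)) U^T; eps guards tiny eigenvalues.
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ w

def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b (assumed kernel choice)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(x, landmarks, gamma, eps=1e-6):
    """Explicit features phi(x) such that phi(x) phi(y)^T approximates k(x, y)."""
    k_mm = rbf_kernel(landmarks, landmarks, gamma)
    k_nm = rbf_kernel(x, landmarks, gamma)
    eigvals, eigvecs = np.linalg.eigh(k_mm)
    # phi = K_nm U diag(1 / sqrt(lambda)), with eigenvalues clamped at eps for stability.
    return k_nm @ eigvecs @ np.diag(1.0 / np.sqrt(np.maximum(eigvals, eps)))

rng = np.random.default_rng(0)
# Toy stand-in for sentence embeddings: 8 strongly correlated dimensions.
mix = np.eye(8) + 0.5 * np.ones((8, 8))
emb = rng.normal(size=(512, 8)) @ mix

# Linear whitening removes second-order correlations only.
linear_white = whiten(emb)

# Kernel variant: approximate the kernel feature map, then whiten in that space.
landmarks = emb[rng.choice(len(emb), size=16, replace=False)]
feats = nystrom_features(emb, landmarks, gamma=0.05)
kernel_white = whiten(feats)
```

In this sketch the whitening is computed once over the full toy dataset; how the statistics are estimated during fine-tuning (for example per batch) is a design decision the abstract does not specify.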
