End-to-End Self-Debiasing Framework for Robust NLU Training
The work addresses robustness issues in NLU that matter to AI practitioners, though the contribution is incremental because it builds on existing debiasing approaches.
The paper tackles dataset biases in Natural Language Understanding (NLU) models, which cause poor out-of-distribution (OOD) performance. It introduces a self-debiasing framework that derives a bias model from the main model's shallow representations and trains both models simultaneously. The method achieves competitive OOD results on three tasks and significantly outperforms other debiasing approaches on two of them, while maintaining high in-distribution performance.
Existing Natural Language Understanding (NLU) models have been shown to incorporate dataset biases, leading to strong performance on in-distribution (ID) test sets but poor performance on out-of-distribution (OOD) ones. We introduce a simple yet effective debiasing framework whereby the shallow representations of the main model are used to derive a bias model and both models are trained simultaneously. We demonstrate on three well-studied NLU tasks that, despite its simplicity, our method leads to competitive OOD results. It significantly outperforms other debiasing approaches on two tasks, while still delivering high in-distribution performance.
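
The joint training idea can be illustrated with a minimal sketch. The abstract does not specify how the bias model's output affects the main model's loss, so the example reweighting used here is an assumption chosen for illustration, not necessarily the authors' objective. The function name `self_debiasing_loss` and the inputs `main_logits` and `bias_logits` are hypothetical; `bias_logits` stands in for the output of a classifier head attached to a shallow layer of the main encoder.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label])


def self_debiasing_loss(main_logits, bias_logits, label):
    """Joint loss for one example (hypothetical helper).

    ASSUMPTION: example reweighting. The main model's loss is scaled
    down when the shallow bias head is already confident in the gold
    label, so examples solvable from shallow cues contribute less.
    """
    main_probs = softmax(main_logits)
    bias_probs = softmax(bias_logits)

    # Weight is near 0 for examples the bias head gets right with high
    # confidence, and near 1 for examples it gets wrong.
    weight = 1.0 - bias_probs[label]

    main_loss = weight * cross_entropy(main_probs, label)
    bias_loss = cross_entropy(bias_probs, label)

    # Both terms are summed so both models train simultaneously.
    return main_loss + bias_loss, weight


# An example the bias head finds easy versus one it finds hard.
_, easy_weight = self_debiasing_loss([2.0, 0.1, 0.1], [5.0, 0.0, 0.0], label=0)
_, hard_weight = self_debiasing_loss([2.0, 0.1, 0.1], [0.0, 5.0, 0.0], label=0)
```

In this sketch `easy_weight` is smaller than `hard_weight`, so the main model's gradient would be dominated by examples that the shallow representations cannot solve. In a gradient-based implementation, the weight would typically be treated as a constant so that the main loss does not push the bias head toward lower confidence; that choice is also an assumption here.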