Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology
This work addresses the need for more reliable EQA models in real-world applications by enhancing robustness, though the contribution is incremental, as it builds on existing training methodologies.
The paper tackles the lack of robustness of Extractive Question Answering (EQA) models against distribution shifts and adversarial attacks by proposing a novel training method. Models trained with it achieve an overall F1 score improvement of 5.7 across all testing sets, and under adversarial attack their performance decrease is only about a third of that of the default models.
This paper proposes a novel training method to improve the robustness of Extractive Question Answering (EQA) models. Previous research has shown that existing models, when trained on EQA datasets that include unanswerable questions, demonstrate a significant lack of robustness against distribution shifts and adversarial attacks. Despite this, the inclusion of unanswerable questions in EQA training datasets is essential for ensuring real-world reliability. Our proposed training method includes a novel loss function for the EQA problem and challenges an implicit assumption present in numerous EQA datasets. Models trained with our method maintain in-domain performance while achieving a notable improvement on out-of-domain datasets. This results in an overall F1 score improvement of 5.7 across all testing sets. Furthermore, our models exhibit significantly enhanced robustness against two types of adversarial attacks, with a performance decrease only about a third of that of the default models.
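To make the reported F1 figures concrete, the following is a minimal sketch of a token-overlap F1 score for extractive QA with unanswerable questions. It assumes a SQuAD 2.0-style convention in which an empty string stands for "no answer"; the abstract does not state which exact metric variant or text normalization the paper uses, so the function name `token_f1` and the whitespace tokenization are illustrative assumptions rather than the authors' evaluation code.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer span and a gold span.

    Assumption (not from the paper): an empty string means "unanswerable",
    and tokens are obtained by lowercasing and splitting on whitespace.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()

    # If either side is "no answer", the score is 1 only when both agree.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both the prediction and the gold answer.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this convention, a model that returns a span for an unanswerable question scores 0 on that example, which is why handling of unanswerable questions weighs directly on the aggregate F1.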