Learning to Ask Unanswerable Questions for Machine Reading Comprehension
This paper addresses the problem of handling unanswerable questions in reading comprehension systems. The contribution is incremental, as it builds on existing datasets and models.
The paper tackles the challenge of machine reading comprehension with unanswerable questions by proposing a data augmentation technique that automatically generates relevant unanswerable questions from answerable ones and their paragraphs, resulting in improvements of 1.9 absolute F1 with BERT-base and 1.7 with BERT-large on SQuAD 2.0.
Machine reading comprehension with unanswerable questions is a challenging task. In this work, we propose a data augmentation technique that automatically generates relevant unanswerable questions from an answerable question paired with its corresponding paragraph that contains the answer. We introduce a pair-to-sequence model for unanswerable question generation, which effectively captures the interactions between the question and the paragraph. We also present a way to construct training data for our question generation models by leveraging the existing reading comprehension dataset. Experimental results show that the pair-to-sequence model consistently outperforms the sequence-to-sequence baseline. We further use the automatically generated unanswerable questions as a means of data augmentation on the SQuAD 2.0 dataset, yielding a 1.9 absolute F1 improvement with the BERT-base model and a 1.7 absolute F1 improvement with the BERT-large model.
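The augmentation step described above can be illustrated with a minimal sketch over SQuAD 2.0-style JSON. This is not the authors' implementation: the function name `augment_with_unanswerable`, the `-unans` id suffix, and the rule-based `toy_generate` (which stands in for the trained pair-to-sequence model) are all hypothetical and exist only to show where generated questions would enter the training data.

```python
import copy
from typing import Callable, Optional


def augment_with_unanswerable(
    squad: dict,
    generate: Callable[[str, str], Optional[str]],
) -> dict:
    """Return a copy of a SQuAD 2.0-style dict with generated
    unanswerable questions appended to each paragraph.

    `generate(question, paragraph)` is a stand-in for a question
    generation model; it returns a new question or None.
    """
    augmented = copy.deepcopy(squad)  # leave the input untouched
    for article in augmented["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            new_qas = []
            for qa in paragraph["qas"]:
                # Only answerable questions serve as generation sources.
                if qa.get("is_impossible", False) or not qa["answers"]:
                    continue
                new_question = generate(qa["question"], context)
                # Skip failed or unchanged generations.
                if not new_question or new_question == qa["question"]:
                    continue
                new_qas.append({
                    "id": qa["id"] + "-unans",  # hypothetical id scheme
                    "question": new_question,
                    "answers": [],
                    "is_impossible": True,
                })
            paragraph["qas"].extend(new_qas)
    return augmented


def toy_generate(question: str, paragraph: str) -> Optional[str]:
    """Hypothetical rule-based placeholder for the learned generator:
    swaps one entity so the paragraph no longer answers the question."""
    if "Normandy" in question:
        return question.replace("Normandy", "Brittany")
    return None


context = "Normandy is a region in France."
toy_squad = {
    "version": "v2.0",
    "data": [{
        "title": "Normans",
        "paragraphs": [{
            "context": context,
            "qas": [
                {
                    "id": "q1",
                    "question": "In what country is Normandy located?",
                    "answers": [{"text": "France",
                                 "answer_start": context.index("France")}],
                    "is_impossible": False,
                },
                {
                    "id": "q2",
                    "question": "Who ruled Brittany?",
                    "answers": [],
                    "is_impossible": True,
                },
            ],
        }],
    }],
}

augmented = augment_with_unanswerable(toy_squad, toy_generate)
```

In this toy run, only the answerable question `q1` yields a new example, and the already unanswerable `q2` is skipped. In practice the `generate` callable would wrap a trained generator, and the augmented dict would be written out as the training file for the reading comprehension model.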