CL · Sep 27, 2018

A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC

arXiv:1809.10735v21147 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work provides a comparative analysis for researchers in natural language processing, identifying dataset strengths and limitations to guide model development. It is incremental, however: it builds on existing datasets without introducing new methods.

The paper compares three question-answering datasets (SQuAD 2.0, QuAC, CoQA) on features like unanswerable questions and multi-turn interactions, finding complementary coverage but weak abstractive answer support, and shows improved baseline results on SQuAD 2.0 and CoQA with an adaptable extractive model.

We compare three new datasets for question answering: SQuAD 2.0, QuAC, and CoQA, along several of their new features: (1) unanswerable questions, (2) multi-turn interactions, and (3) abstractive answers. We show that the datasets provide complementary coverage of the first two aspects, but weak coverage of the third. Because of the datasets' structural similarity, a single extractive model can be easily adapted to any of the datasets and we show improved baseline results on both SQuAD 2.0 and CoQA. Despite the similarity, models trained on one dataset are ineffective on another dataset, but we find moderate performance improvement through pretraining. To encourage cross-evaluation, we release code for conversion between datasets at https://github.com/my89/co-squac .
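The abstract notes that the datasets are structurally similar enough for a single extractive model to be adapted to any of them. As a rough illustration of what such a conversion might involve, the sketch below flattens a multi-turn dialog into one SQuAD 2.0-style paragraph entry. It is not the code from the co-squac repository: the function name `conversation_to_squad2`, the simplified input structure, and the choice to prepend earlier questions as history are all assumptions made for illustration. Only the output field names (`context`, `qas`, `id`, `question`, `answers`, `answer_start`, `is_impossible`) follow the public SQuAD 2.0 JSON layout.

```python
from typing import Dict, List, Optional


def conversation_to_squad2(
    context: str,
    turns: List[Dict[str, Optional[str]]],
    dialog_id: str = "dialog-0",
    history: int = 1,
) -> Dict:
    """Flatten a multi-turn dialog into one SQuAD 2.0-style paragraph.

    Hypothetical input: each turn is a dict with a "question" string and an
    "answer" string, or None when the question is unanswerable.
    """
    qas = []
    for i, turn in enumerate(turns):
        # Assumption: prepend up to `history` earlier questions so that each
        # flattened entry carries some conversational context.
        previous = [t["question"] for t in turns[max(0, i - history):i]]
        question = " ".join(previous + [turn["question"]])

        answer = turn["answer"]
        # Extractive formats need a character offset into the context.
        start = context.find(answer) if answer is not None else -1
        # Simplification: an answer that is not a literal span (for example an
        # abstractive one) is treated as unanswerable here.
        impossible = start < 0

        qas.append({
            "id": f"{dialog_id}_q{i}",
            "question": question,
            "is_impossible": impossible,
            "answers": [] if impossible else [
                {"text": answer, "answer_start": start}
            ],
        })
    return {"context": context, "qas": qas}
```

Treating non-span answers as unanswerable loses information, which reflects the weak coverage of abstractive answers that the abstract describes. A real conversion would need a deliberate policy for those cases.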

Code Implementations: 1 repo
