CL · Jun 11, 2018

Know What You Don't Know: Unanswerable Questions for SQuAD

arXiv:1806.03822v1 · 3314 citations
Originality: Synthesis-oriented
AI Analysis

The paper gives researchers and practitioners a more robust benchmark for natural language understanding systems, though the contribution is incremental because it builds on the existing SQuAD dataset.

Extractive reading comprehension systems tend to make unreliable guesses on unanswerable questions. The paper addresses this by introducing SQuAD 2.0, which combines existing SQuAD data with over 50,000 adversarially written unanswerable questions. A strong neural system that scores 86% F1 on SQuAD 1.1 reaches only 66% F1 on SQuAD 2.0.

Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.
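To make the abstention requirement concrete, here is a minimal sketch of how a token-level F1 score could treat unanswerable questions, with the empty string standing for "no answer". The function names (`normalize`, `token_f1`, `answer_or_abstain`), the normalization steps, and the threshold rule are illustrative assumptions, not the paper's evaluation code or any model's actual decision procedure.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: a simplified normalization chosen for illustration.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1; an empty string means 'no answer'."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # If either side is empty, the score is 1 only when both are empty,
    # i.e. the system abstained on a question that is truly unanswerable.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def answer_or_abstain(best_span: str, no_answer_score: float, threshold: float) -> str:
    """Hypothetical decision rule: abstain when the no-answer score exceeds a threshold."""
    return "" if no_answer_score > threshold else best_span
```

Under this sketch, a system that always guesses a span scores zero on every unanswerable question, which is one way to see why adding such questions can lower F1 for a model built only for answerable ones.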

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

