QED: A Framework and Dataset for Explanations in Question Answering
This work addresses the need for debuggability and trust in QA systems, though its contribution is incremental: it builds on existing datasets and methods.
The authors aim to make question answering systems more transparent. They propose QED, a framework for explanations grounded in formal semantics, and release an expert-annotated dataset built on a subset of Google Natural Questions. They report that training on a relatively small amount of QED data can improve QA performance, and that QED explanations significantly improved the ability of untrained raters to spot errors made by a strong neural QA baseline.
A question answering system that, in addition to providing an answer, provides an explanation of the reasoning that leads to that answer has potential advantages in terms of debuggability, extensibility and trust. To this end, we propose QED, a linguistically informed, extensible framework for explanations in question answering. A QED explanation specifies the relationship between a question and answer according to formal semantic notions such as referential equality, sentencehood, and entailment. We describe and publicly release an expert-annotated dataset of QED explanations built upon a subset of the Google Natural Questions dataset, and report baseline models on two tasks -- post-hoc explanation generation given an answer, and joint question answering and explanation generation. In the joint setting, a promising result suggests that training on a relatively small amount of QED data can improve question answering. In addition to describing the formal, language-theoretic motivations for the QED approach, we describe a large user study showing that the presence of QED explanations significantly improves the ability of untrained raters to spot errors made by a strong neural QA baseline.
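
To make the shape of such an explanation concrete, the following is a minimal sketch of how one might be represented and structurally checked. It is an illustration under stated assumptions, not the released dataset schema: the class name, field names, character-offset spans, the rule that the answer lies inside the selected sentence, and the toy question and passage are all hypothetical. The check covers structure only and does not attempt to verify entailment.

```python
from dataclasses import dataclass, field, replace
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass
class QEDExplanation:
    """Hypothetical container; names are illustrative, not the released schema."""
    question: str
    passage: str
    sentence: Span  # sentencehood: one passage sentence that supports the answer
    answer: Span    # answer span, given as offsets into the passage
    # Referential equality: (question span, passage span) pairs taken to corefer.
    equalities: List[Tuple[Span, Span]] = field(default_factory=list)


def find(text: str, phrase: str) -> Span:
    """Return the span of the first occurrence of phrase (raises if absent)."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def _in_bounds(span: Span, text: str) -> bool:
    start, end = span
    return 0 <= start < end <= len(text)


def is_well_formed(ex: QEDExplanation) -> bool:
    """Structural check only; it does not verify entailment."""
    if not (_in_bounds(ex.sentence, ex.passage) and _in_bounds(ex.answer, ex.passage)):
        return False
    # Assumption: the answer must lie inside the selected sentence.
    if not (ex.sentence[0] <= ex.answer[0] and ex.answer[1] <= ex.sentence[1]):
        return False
    return all(
        _in_bounds(q, ex.question) and _in_bounds(p, ex.passage)
        for q, p in ex.equalities
    )


# Toy data invented for illustration; not drawn from the QED dataset.
passage = "Ada Lovelace wrote notes on the Analytical Engine. She published them in 1843."
question = "when did ada lovelace publish her notes"

example = QEDExplanation(
    question=question,
    passage=passage,
    sentence=(passage.index("She"), len(passage)),
    answer=find(passage, "1843"),
    equalities=[(find(question, "ada lovelace"), find(passage, "She"))],
)

# Same record, but with an answer span that falls outside the selected sentence.
malformed = replace(example, answer=find(passage, "Ada Lovelace"))
```

Storing spans as offsets rather than copied strings keeps each part of the explanation tied to a location in the question or passage, which is what lets a check like `is_well_formed` reject the `malformed` record above.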