Explaining Answers with Entailment Trees
This work contributes a new dataset and baselines for generating systematic explanations in QA, opening a new avenue for the community, although it builds incrementally on existing explanation methods.
The paper tackles the problem of explaining answers in open-domain QA by generating entailment trees that show the reasoning steps from known facts to the answer, rather than just providing textual evidence. It introduces ENTAILMENTBANK, the first dataset with multistep entailment trees, and shows that a strong language model can partially solve the three explanation tasks defined on it: when all the relevant sentences are given as input, 35% of the generated trees are perfect.
Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by showing the line of reasoning from what is known to the answer, rather than simply showing a fragment of textual evidence (a "rationale"). If this could be done, new opportunities for understanding and debugging the system's reasoning would become possible. Our approach is to generate explanations in the form of entailment trees, namely a tree of multipremise entailment steps from facts that are known, through intermediate conclusions, to the hypothesis of interest (namely the question + answer). To train a model with this skill, we created ENTAILMENTBANK, the first dataset to contain multistep entailment trees. Given a hypothesis (question + answer), we define three increasingly difficult explanation tasks: generate a valid entailment tree given (a) all relevant sentences, (b) all relevant and some irrelevant sentences, or (c) a corpus. We show that a strong language model can partially solve these tasks, in particular when the relevant sentences are included in the input (e.g., 35% of trees for (a) are perfect), and with indications of generalization to other domains. This work is significant as it provides a new type of dataset (multistep entailments) and baselines, offering a new avenue for the community to generate richer, more systematic explanations.
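To make the structure described above concrete, here is a minimal Python sketch of an entailment tree as an ordered list of multipremise steps. It is an illustration only, not the dataset's actual format or the authors' code: the names Step, is_well_formed, and leaves_used, the identifiers sent1, int1, and hypothesis, and the example sentences are all assumptions invented for this sketch. The check is purely structural; it does not judge whether a step is a valid entailment.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Step:
    """One multipremise entailment step (hypothetical representation)."""
    premises: Tuple[str, ...]  # ids of known facts or earlier conclusions
    conclusion: str            # id of the conclusion derived by this step


def is_well_formed(facts: Dict[str, str], steps: List[Step], hypothesis_id: str) -> bool:
    """Structural check: premises must already be available, intermediate
    conclusions must be used later, and the last step must reach the hypothesis."""
    if not steps or steps[-1].conclusion != hypothesis_id:
        return False
    available: Set[str] = set(facts)
    used: Set[str] = set()
    for step in steps:
        if not step.premises:
            return False
        if any(p not in available for p in step.premises):
            return False  # premise is neither a fact nor an earlier conclusion
        if step.conclusion in available:
            return False  # a conclusion must be a new node
        available.add(step.conclusion)
        used.update(step.premises)
    intermediates = {s.conclusion for s in steps[:-1]}
    return intermediates <= used  # no dangling intermediate conclusions


def leaves_used(facts: Dict[str, str], steps: List[Step]) -> Set[str]:
    """Ids of the input facts that appear as premises somewhere in the tree."""
    return {p for s in steps for p in s.premises if p in facts}


# Invented example in the spirit of task (b): three relevant sentences
# plus one irrelevant distractor (sent4).
facts = {
    "sent1": "a plant needs sunlight to make food",
    "sent2": "a cave receives no sunlight",
    "sent3": "an organism that cannot make food cannot grow",
    "sent4": "a rock is a kind of nonliving thing",
}
steps = [
    Step(("sent1", "sent2"), "int1"),        # int1: a plant in a cave cannot make food
    Step(("int1", "sent3"), "hypothesis"),   # hypothesis: a plant cannot grow in a cave
]

print(is_well_formed(facts, steps, "hypothesis"))
print(sorted(leaves_used(facts, steps)))
```

In this representation, the three tasks differ only in what the facts dictionary contains: exactly the leaves of the gold tree for (a), the leaves plus distractors such as sent4 for (b), and sentences retrieved from a corpus for (c).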