CL | Nov 10, 2020

Natural Language Inference in Context -- Investigating Contextual Reasoning over Long Texts

arXiv:2011.04864v1 | 57 citations
AI Analysis

This paper addresses the problem of testing contextual reasoning in NLP. Its contribution is incremental in that it focuses on dataset creation rather than method development.

The authors address the sentence-level focus of existing natural language inference datasets by introducing ConTRoL, a new dataset for contextual reasoning over long texts, and show that state-of-the-art language models perform far worse than educated humans on this challenging task.

Natural language inference (NLI) is a fundamental NLP task that investigates the entailment relationship between two texts. Popular NLI datasets present the task at the sentence level. While adequate for testing semantic representations, they fall short in testing contextual reasoning over long texts, which is a natural part of the human inference process. We introduce ConTRoL, a new dataset for ConTextual Reasoning over Long texts. Consisting of 8,325 expert-designed "context-hypothesis" pairs with gold labels, ConTRoL is a passage-level NLI dataset with a focus on complex contextual reasoning types such as logical reasoning. It is derived from competitive selection and recruitment tests (verbal reasoning tests) for police recruitment, with expert-level quality. Compared with previous NLI benchmarks, the materials in ConTRoL are much more challenging, involving a range of reasoning types. Empirical results show that state-of-the-art language models perform far worse than educated humans. Our dataset can also serve as a test set for downstream tasks such as checking the factual correctness of summaries.
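
To make the "context-hypothesis pair with gold label" format concrete, here is a minimal Python sketch of how a passage-level NLI example might be represented and scored. It is illustrative only: the class name `NLIExample`, the field names, the three-way label set, and the toy examples are all assumptions and are not taken from the ConTRoL release.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumption: the conventional three-way NLI label set. The abstract only
# says "gold labels", so the exact label names here are hypothetical.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class NLIExample:
    """One passage-level NLI item: a long context, a hypothesis, a gold label."""
    context: str
    hypothesis: str
    gold_label: str

    def __post_init__(self) -> None:
        # Reject labels outside the assumed label set.
        if self.gold_label not in LABELS:
            raise ValueError(f"unknown label: {self.gold_label!r}")


def accuracy(examples: Iterable[NLIExample],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of examples where predict(context, hypothesis) matches gold."""
    items = list(examples)
    if not items:
        raise ValueError("no examples to score")
    correct = sum(predict(ex.context, ex.hypothesis) == ex.gold_label
                  for ex in items)
    return correct / len(items)


def always_neutral(context: str, hypothesis: str) -> str:
    """Trivial baseline that ignores its input (for illustration only)."""
    return "neutral"


# Made-up toy items, NOT drawn from ConTRoL.
toy_examples = [
    NLIExample(
        context="All applicants must pass a fitness test. Dana is an applicant.",
        hypothesis="Dana must pass a fitness test.",
        gold_label="entailment",
    ),
    NLIExample(
        context="The station opens at nine. Some officers arrive early.",
        hypothesis="Every officer arrives before nine.",
        gold_label="neutral",
    ),
]

baseline_accuracy = accuracy(toy_examples, always_neutral)
```

A real model would replace `always_neutral` with a function that reads the full passage. The same `accuracy` function could then be used to compare model predictions with human answers on the same items.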

Code Implementations: 1 repo
