CL, Jul 16, 2020

LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning

Jian Liu, Leyang Cui, Hanmeng Liu, Dandan Huang, Yile Wang, Yue Zhang

arXiv:2007.08124

Originality: Synthesis-oriented

AI Analysis

LogiQA addresses the lack of datasets for logical reasoning in NLP and provides a benchmark for evaluating AI models, though the contribution is incremental in that it targets a single capability.

The authors introduced LogiQA, a dataset of 8,678 expert-written questions to test logical reasoning in machine reading comprehension, and found that state-of-the-art neural models perform significantly worse than humans.

Machine reading is a fundamental task for testing the capability of natural language understanding, which is closely related to human cognition in many aspects. With the rise of deep learning techniques, algorithmic models rival human performance on simple QA, and thus increasingly challenging machine reading datasets have been proposed. Though various challenges such as evidence integration and commonsense knowledge have been integrated, one of the fundamental capabilities in human reading, namely logical reasoning, has not been fully investigated. We build a comprehensive dataset, named LogiQA, which is sourced from expert-written questions for testing human logical reasoning. It consists of 8,678 QA instances, covering multiple types of deductive reasoning. Results show that state-of-the-art neural models perform far worse than the human ceiling. Our dataset can also serve as a benchmark for reinvestigating logical AI under the deep learning NLP setting. The dataset is freely available at https://github.com/lgw863/LogiQA-dataset
