CL AI LG | Apr 15, 2017

RACE: Large-scale ReAding Comprehension Dataset From Examinations

Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard Hovy

arXiv:1704.04683v5 | 35.21635 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This dataset addresses the need for a more challenging benchmark in machine reading comprehension, particularly for evaluating reasoning ability. The contribution is incremental: it builds on existing datasets and differs from them mainly in its focus on reasoning-heavy questions.

The authors introduce RACE, a large-scale reading comprehension dataset drawn from English exams for Chinese middle and high school students. It contains nearly 28,000 passages and nearly 100,000 questions, a high proportion of which require reasoning. State-of-the-art models reach 43% accuracy against a human ceiling of 95%.

We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task. Collected from English exams for Chinese middle and high school students aged 12 to 18, RACE consists of nearly 28,000 passages and nearly 100,000 questions generated by human experts (English instructors), and covers a variety of topics carefully designed to evaluate students' ability in understanding and reasoning. In particular, the proportion of questions that require reasoning is much larger in RACE than in other benchmark datasets for reading comprehension, and there is a significant gap between the performance of state-of-the-art models (43%) and the ceiling human performance (95%). We hope this new dataset can serve as a valuable resource for research and evaluation in machine comprehension. The dataset is freely available at http://www.cs.cmu.edu/~glai1/data/race/ and the code is available at https://github.com/qizhex/RACE_AR_baselines.
