CL · Jul 6, 2023

KoRC: Knowledge oriented Reading Comprehension Benchmark for Deep Text Understanding

Tsinghua
arXiv:2307.03115v1227 citations · h-index: 30 · Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses a problem relevant to AI researchers: how to evaluate deep text understanding. The contribution is incremental, since it builds on prior benchmarks by improving knowledge coverage and answer formats.

The authors tackled the limitations of existing benchmarks for deep text understanding by creating KoRC, a new benchmark with broad knowledge coverage and flexible answer formats, where the strongest baseline achieved only 68.3% and 30.0% F1 scores on in-distribution and out-of-distribution tests, respectively.

Deep text understanding, which requires connecting a given document to prior knowledge beyond its text, has been highlighted by many benchmarks in recent years. However, these benchmarks have encountered two major limitations. On the one hand, most of them require human annotation of knowledge, which leads to limited knowledge coverage. On the other hand, they usually use choices or spans in the texts as the answers, which results in a narrow answer space. To overcome these limitations, we build a new challenging benchmark named KoRC in this paper. Compared with previous benchmarks, KoRC has two advantages, i.e., broad knowledge coverage and a flexible answer format. Specifically, we utilize massive knowledge bases to guide annotators or large language models (LLMs) to construct knowledgeable questions. Moreover, we use labels in knowledge bases rather than spans or choices as the final answers. We test state-of-the-art models on KoRC, and the experimental results show that the strongest baseline only achieves 68.3% and 30.0% F1 on the in-distribution and out-of-distribution test sets, respectively. These results indicate that deep text understanding is still an unsolved challenge. The benchmark dataset, leaderboard, and baseline methods are released at https://github.com/THU-KEG/KoRC.
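Because the answers are knowledge-base labels rather than spans or choices, a prediction can be scored by comparing sets of labels. The following is a minimal sketch of one common way to compute such an F1 score, not the official KoRC scorer: the function name `answer_f1`, the lowercase/strip normalization, and the set-level matching are all assumptions made for illustration, and the released evaluation code may differ.

```python
def answer_f1(predicted, gold):
    """Set-level F1 between predicted and gold answer labels.

    Hypothetical helper for illustration only; normalization and
    matching rules are assumptions, not KoRC's official metric.
    """
    # Normalize labels so that trivial casing/whitespace differences match.
    pred = {p.strip().lower() for p in predicted}
    ref = {g.strip().lower() for g in gold}

    # If either side is empty, score 1.0 only when both are empty.
    if not pred or not ref:
        return float(pred == ref)

    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred)  # fraction of predictions that are correct
    recall = overlap / len(ref)      # fraction of gold labels that were found
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # One correct label plus one spurious label: precision 0.5, recall 1.0.
    print(round(answer_f1(["Tsinghua University", "Beijing"],
                          ["tsinghua university"]), 3))
```

Under this sketch, a question with several gold labels gives partial credit when only some of them are predicted, which is the main reason to prefer F1 over exact match for an open answer space.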

Code Implementations: 1 repo
