CL · Jun 7, 2024

ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering

arXiv:2406.04866v3 · 6 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

ComplexTempQA provides a large-scale resource for evaluating temporal reasoning in AI systems, addressing the need for more complex temporal QA tasks.

The authors address temporal question answering by introducing ComplexTempQA, a dataset of over 100 million question-answer pairs that surpasses existing benchmarks in scale and complexity and covers questions spanning more than two decades.

We introduce ComplexTempQA, a large-scale dataset consisting of over 100 million question-answer pairs designed to tackle the challenges in temporal question answering (dataset and code available at: https://github.com/DataScienceUIBK/ComplexTempQA). ComplexTempQA significantly surpasses existing benchmarks in scale and scope. Built from Wikipedia and Wikidata, the dataset covers questions spanning over two decades. We introduce a new taxonomy that categorizes questions as attribute, comparison, and counting questions, revolving around events, entities, and time periods. A standout feature of ComplexTempQA is the high complexity of its questions, which require reasoning capabilities such as across-time comparison, temporal aggregation, and multi-hop reasoning involving temporal event ordering and entity recognition. Additionally, each question is accompanied by detailed metadata, including specific time scopes, allowing for comprehensive evaluation of the temporal reasoning abilities of large language models.
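The taxonomy and per-question time scopes described in the abstract suggest a simple way to slice an evaluation. The sketch below is illustrative only: the record layout, the field names (`qtype`, `start_year`, `end_year`), and the helper functions are assumptions made here and are not taken from the released dataset, whose actual schema may differ. The sample records are synthetic placeholders, not dataset entries.

```python
from dataclasses import dataclass
from enum import Enum


class QuestionType(Enum):
    # The three categories named in the abstract's taxonomy.
    ATTRIBUTE = "attribute"
    COMPARISON = "comparison"
    COUNTING = "counting"


@dataclass(frozen=True)
class TemporalQA:
    # Hypothetical record layout; the real dataset may use different fields.
    question: str
    answer: str
    qtype: QuestionType
    start_year: int  # first year of the question's time scope (inclusive)
    end_year: int    # last year of the question's time scope (inclusive)


def in_scope(item, first_year, last_year):
    """True if the item's time scope lies entirely within [first_year, last_year]."""
    return first_year <= item.start_year and item.end_year <= last_year


def accuracy_by_type(items, predictions):
    """Case-insensitive exact-match accuracy per question type.

    `predictions` is aligned with `items` by position.
    """
    totals = {}
    correct = {}
    for item, pred in zip(items, predictions):
        totals[item.qtype] = totals.get(item.qtype, 0) + 1
        if pred.strip().lower() == item.answer.strip().lower():
            correct[item.qtype] = correct.get(item.qtype, 0) + 1
    return {qtype: correct.get(qtype, 0) / n for qtype, n in totals.items()}


# Synthetic placeholder records, invented for this sketch.
items = [
    TemporalQA("placeholder attribute question", "A", QuestionType.ATTRIBUTE, 1990, 1990),
    TemporalQA("placeholder counting question", "3", QuestionType.COUNTING, 1995, 2000),
    TemporalQA("placeholder comparison question", "B", QuestionType.COMPARISON, 1988, 2005),
]
predictions = ["a", "4", "B"]

scores = accuracy_by_type(items, predictions)
nineties = [item for item in items if in_scope(item, 1990, 2000)]
```

Grouping scores by question type and filtering by time scope is one plausible use of the metadata; exact-match scoring is chosen here only for simplicity and is not claimed to be the paper's evaluation protocol.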

Code Implementations: 1 repo