A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases
TempQA-WD gives researchers a resource for developing and evaluating methods for complex temporal reasoning in Knowledge Base Question Answering (KBQA), filling a gap in existing datasets.
The authors tackled the lack of datasets for temporal reasoning in Knowledge Base Question Answering by introducing TempQA-WD, a benchmark based on Wikidata that includes intermediate SPARQL queries and generalizes to multiple knowledge bases.
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research on extending present approaches to a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata. The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.
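To make the idea of an intermediate SPARQL query for a temporal question more concrete, the following is a minimal sketch of how a question such as "Who held position X during year Y?" might be expressed over Wikidata. It is an illustration only and is not taken from the TempQA-WD dataset: the helper name `build_position_holder_query` is hypothetical, and the sketch assumes Wikidata's standard statement and qualifier prefixes (`p:`, `ps:`, `pq:`) together with the properties P39 (position held), P580 (start time), and P582 (end time).

```python
def build_position_holder_query(position_qid: str, year: int) -> str:
    """Build a SPARQL query asking who held a position at any point in `year`.

    Hypothetical helper for illustration; not part of TempQA-WD.
    """
    # Basic sanity check so that arbitrary text is not spliced into the query.
    if not (position_qid.startswith("Q") and position_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {position_qid!r}")

    # Boundaries of the year of interest as xsd:dateTime literals.
    year_start = f'"{year:04d}-01-01T00:00:00Z"^^xsd:dateTime'
    year_end = f'"{year:04d}-12-31T23:59:59Z"^^xsd:dateTime'

    # A tenure overlaps the year if it started on or before the year's end
    # and either has no recorded end or ended on or after the year's start.
    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?person WHERE {{
  ?person p:P39 ?stmt .
  ?stmt ps:P39 wd:{position_qid} ;
        pq:P580 ?start .
  OPTIONAL {{ ?stmt pq:P582 ?end . }}
  FILTER(?start <= {year_end})
  FILTER(!BOUND(?end) || ?end >= {year_start})
}}
""".strip()


if __name__ == "__main__":
    # Q11696 is the Wikidata item for "President of the United States".
    print(build_position_holder_query("Q11696", 1995))
```

The temporal part of the question lives entirely in the two `FILTER` clauses over the start and end qualifiers, which is the kind of structure a semantic parsing approach would need to produce for temporal questions.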