CL Oct 1, 2020

LiveQA: A Question Answering Dataset over Sports Live

Qianying Liu, Sicong Jiang, Yizhong Wang, Sujian Li

arXiv:2010.00526v1 31.1997 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This dataset addresses a challenging problem for AI researchers in sports analytics by providing a benchmark for timeline-based reasoning, though it is incremental as it builds on existing QA datasets.

The authors introduced LiveQA, a new question answering dataset with 117k multiple-choice questions from NBA game broadcasts, which tests reasoning across timelines and events, and found that a strong baseline model achieved only 53.1% accuracy, failing to beat a dominant option rule.

In this paper, we introduce LiveQA, a new question answering dataset constructed from play-by-play live broadcasts. It contains 117k multiple-choice questions written by human commentators for over 1,670 NBA games, collected from the Chinese Hupu website (https://nba.hupu.com/games). Derived from the characteristics of sports games, LiveQA can potentially test reasoning ability across timeline-based live broadcasts, which is challenging compared to existing datasets. The questions in LiveQA require understanding the timeline, tracking events, or doing mathematical computations. Our preliminary experiments show that the dataset poses a challenging problem for question answering models: a strong baseline model achieves an accuracy of only 53.1% and cannot beat the dominant option rule. We release the code and data of this paper for future research.
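The abstract compares the baseline model against a "dominant option rule" without defining it here. One plausible reading, which is an assumption and not confirmed by this page, is a rule that always picks the answer option that is most frequent in the training data. A minimal sketch of that reading follows; the function name `dominant_option_accuracy` and the toy answer indices are hypothetical and are not taken from the LiveQA code or data.

```python
from collections import Counter


def dominant_option_accuracy(train_answers, test_answers):
    """Return (dominant option, accuracy) for a majority-option rule.

    Assumption: each answer is the index of the correct option, and the
    rule always predicts the index seen most often in training.
    """
    # Most frequent correct-option index in the training split.
    dominant, _ = Counter(train_answers).most_common(1)[0]
    # Fraction of test questions whose correct option is that index.
    correct = sum(1 for answer in test_answers if answer == dominant)
    return dominant, correct / len(test_answers)


# Toy, made-up answer indices (not LiveQA data).
train = [0, 0, 1, 0, 1, 0]
test = [0, 1, 0, 0]
dominant, acc = dominant_option_accuracy(train, test)
print(dominant, acc)
```

Under this reading, a model has to exceed the accuracy returned here to show that it uses the question and broadcast text at all, which is why such a rule is a useful reference point for a multiple-choice dataset.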

