IR, CL · May 22, 2019

ANTIQUE: A Non-Factoid Question Answering Benchmark

Helia Hashemi, Mohammad Aliannejadi, Hamed Zamani, W. Bruce Croft

arXiv:1905.08957v2 · 22.697 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for better evaluation resources for non-factoid question answering in information retrieval, though the contribution is incremental because it builds on existing dataset creation efforts.

The authors tackled the lack of large-scale datasets for non-factoid question answering by developing ANTIQUE, a benchmark with 2,626 real user questions and 34,011 manual relevance annotations, and provided baseline results for IR models.

Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still feels the significant lack of large-scale non-factoid question answering collections with real questions and comprehensive relevance judgments. In this paper, we develop and release a collection of 2,626 open-domain non-factoid questions from a diverse set of categories. The dataset, called ANTIQUE, contains 34,011 manual relevance annotations. The questions were asked by real users in a community question answering service, i.e., Yahoo! Answers. Relevance judgments for all the answers to each question were collected through crowdsourcing. To facilitate further research, we also include a brief analysis of the data as well as baseline results on both classical and recently developed neural IR models.
