IR, CL | Aug 12, 2020

Fine-Grained Relevance Annotations for Multi-Task Document Ranking and Question Answering

arXiv:2008.05363v1 | 12 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of evaluating multi-task approaches in information retrieval and question answering. The contribution is incremental because it builds on existing datasets.

The authors tackle the lack of datasets for evaluating document ranking and question answering together by creating FiRA, a fine-grained relevance annotation dataset. They find that the TKL model achieves state-of-the-art retrieval results for long documents but misses many relevant passages.

There are many existing retrieval and question answering datasets. However, most of them either focus on ranked list evaluation or single-candidate question answering. This divide makes it challenging to properly evaluate approaches concerned with ranking documents and providing snippets or answers for a given query. In this work, we present FiRA: a novel dataset of Fine-Grained Relevance Annotations. We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage and word level graded relevance annotations for all relevant documents. We use our newly created data to study the distribution of relevance in long documents, as well as the attention of annotators to specific positions of the text. As an example, we evaluate the recently introduced TKL document ranking model. We find that although TKL exhibits state-of-the-art retrieval results for long documents, it misses many relevant passages.
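
To make the "misses many relevant passages" finding concrete, here is a minimal sketch of how passage-level recall against graded annotations could be computed. It is an illustration only: the `PassageAnnotation` record, the character-offset spans, the grade scale, and the `passage_recall` function are all hypothetical and do not reflect the actual FiRA file format or the evaluation procedure used by the authors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PassageAnnotation:
    """Hypothetical record: one annotated passage inside a document."""
    doc_id: str
    start: int   # character offset, inclusive (assumed representation)
    end: int     # character offset, exclusive
    grade: int   # graded relevance; the scale here is an assumption


def spans_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open spans share at least one character."""
    return a_start < b_end and b_start < a_end


def passage_recall(
    annotations: List[PassageAnnotation],
    model_regions: Dict[str, List[Tuple[int, int]]],
    min_grade: int = 1,
) -> Optional[float]:
    """Fraction of relevant annotated passages that overlap any region
    the model attended to. Returns None if there are no relevant passages."""
    relevant = [a for a in annotations if a.grade >= min_grade]
    if not relevant:
        return None
    hits = sum(
        1
        for a in relevant
        if any(
            spans_overlap(a.start, a.end, s, e)
            for s, e in model_regions.get(a.doc_id, [])
        )
    )
    return hits / len(relevant)


# Toy data, invented purely for illustration.
example_annotations = [
    PassageAnnotation("d1", 0, 100, grade=3),
    PassageAnnotation("d1", 200, 300, grade=1),
    PassageAnnotation("d1", 400, 500, grade=0),
    PassageAnnotation("d2", 50, 150, grade=2),
]
example_regions = {"d1": [(80, 120)]}

recall_all = passage_recall(example_annotations, example_regions)
recall_top = passage_recall(example_annotations, example_regions, min_grade=3)
```

In this toy setup the model covers only one of the three relevant passages, so a document can be ranked correctly while most of its relevant passages go unseen. Raising `min_grade` restricts the measure to the most relevant passages.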

Code Implementations: 1 repo