CL · Jun 11, 2019

Retrieve, Read, Rerank: Towards End-to-End Multi-Document Reading Comprehension

arXiv:1906.04618v11122 citations
Originality: Incremental advance
AI Analysis

This paper addresses inefficient and disjointed training in multi-document QA systems, offering researchers and practitioners an incremental improvement over existing pipeline methods.

The paper tackles the inefficiency and training limitations of pipeline systems for multi-document reading comprehension by introducing RE³QA, a unified model that integrates retrieving, reading, and reranking with shared representations and end-to-end training. It outperforms baselines and achieves state-of-the-art results on TriviaQA and SQuAD variants.

This paper considers the reading comprehension task in which multiple documents are given as input. Prior work has shown that a pipeline of retriever, reader, and reranker can improve the overall performance. However, the pipeline system is inefficient since the input is re-encoded within each module, and is unable to leverage upstream components to help downstream training. In this work, we present RE$^3$QA, a unified question answering model that combines context retrieving, reading comprehension, and answer reranking to predict the final answer. Unlike previous pipelined approaches, RE$^3$QA shares contextualized text representation across different components, and is carefully designed to use high-quality upstream outputs (e.g., retrieved context or candidate answers) for directly supervising downstream modules (e.g., the reader or the reranker). As a result, the whole network can be trained end-to-end to avoid the context inconsistency problem. Experiments show that our model outperforms the pipelined baseline and achieves state-of-the-art results on two versions of TriviaQA and two variants of SQuAD.
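The shared-representation idea in the abstract can be illustrated with a toy sketch. This is not the authors' RE$^3$QA model: `SharedEncoder`, `overlap`, and `answer` are hypothetical names, a bag-of-words counter stands in for the contextual encoder, and token overlap stands in for learned scoring. It only shows each segment being encoded once, with the same representations feeding retrieval, reading, and reranking.

```python
from collections import Counter


class SharedEncoder:
    """Toy stand-in for a contextual encoder; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def encode(self, text):
        self.calls += 1
        return Counter(text.lower().split())


def overlap(a, b):
    # Size of the multiset intersection of two bag-of-words encodings.
    return sum((a & b).values())


def answer(question, segments, top_k=2):
    enc = SharedEncoder()
    q = enc.encode(question)
    # Each segment is encoded exactly once; all stages below reuse `reps`.
    reps = [enc.encode(s) for s in segments]

    # Retrieve: keep the top_k segments by overlap with the question.
    scores = [overlap(q, r) for r in reps]
    kept = sorted(range(len(segments)), key=lambda i: -scores[i])[:top_k]

    # Read: propose non-question tokens from the kept segments as candidates,
    # each carrying the score of the segment it came from.
    candidates = [(tok, scores[i]) for i in kept for tok in reps[i] if tok not in q]

    # Rerank: pool evidence for identical candidates across segments.
    totals = Counter()
    for tok, score in candidates:
        totals[tok] += score
    best = max(totals, key=totals.get) if totals else None
    return best, enc.calls


best, calls = answer(
    "who wrote hamlet",
    [
        "shakespeare wrote hamlet",
        "hamlet was written by shakespeare",
        "paris is in france",
    ],
)
print(best, calls)
```

In a pipelined design, each stage would call the encoder again on its own input; here the encoder runs once for the question and once per segment, regardless of how many stages consume the result.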

Code Implementations: 1 repo