CL · Aug 20, 2018

Adaptive Document Retrieval for Deep Question Answering

arXiv:1808.06528v1 · 128 citations
Originality: Highly original
AI Analysis

This paper addresses a key bottleneck in deep question answering systems: deciding how many documents to retrieve. It offers researchers and practitioners a novel way to improve retrieval efficiency and accuracy, though the contribution is incremental in that it optimizes a single component of an existing pipeline.

The paper tackles the problem of determining the optimal number of documents to retrieve in deep question answering systems, showing that static retrieval suffers from a noise-information trade-off and yields suboptimal results. It proposes an adaptive model that learns this number based on corpus size and query, outperforming state-of-the-art methods on multiple benchmarks with variable corpus sizes.

State-of-the-art systems in deep question answering proceed as follows: (1) an initial document retrieval selects relevant documents, which (2) are then processed by a neural network in order to extract the final answer. Yet the exact interplay between both components is poorly understood, especially concerning the number of candidate documents that should be retrieved. We show that choosing a static number of documents -- as used in prior research -- suffers from a noise-information trade-off and yields suboptimal results. As a remedy, we propose an adaptive document retrieval model. This learns the optimal candidate number for document retrieval, conditional on the size of the corpus and the query. We report extensive experimental results showing that our adaptive approach outperforms state-of-the-art methods on multiple benchmark datasets, as well as in the context of corpora with variable sizes.
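To make the contrast between static and adaptive retrieval concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract describes a learned model conditioned on corpus size and query, whereas this sketch swaps in a simple hand-written heuristic that keeps documents until their cumulative normalized retrieval score reaches a threshold. The function names `static_top_k` and `adaptive_cutoff` and the parameters `threshold` and `max_docs` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def static_top_k(scores: Sequence[float], k: int) -> int:
    """Static baseline: always keep k documents (fewer if the ranking is shorter)."""
    return min(k, len(scores))


def adaptive_cutoff(
    scores: Sequence[float],
    threshold: float = 0.75,  # hypothetical value, not taken from the paper
    max_docs: int = 20,       # hypothetical upper bound on retrieved documents
) -> int:
    """Return how many top-ranked documents to pass to the reader.

    Assumes `scores` are non-negative retrieval scores sorted in
    descending order. This is an illustrative heuristic, not the
    learned model described in the abstract.
    """
    if not scores:
        return 0
    total = sum(scores)
    if total <= 0:
        # No usable signal in the scores: fall back to a single document.
        return 1
    cumulative = 0.0
    for n, score in enumerate(scores, start=1):
        cumulative += score / total
        # Stop once enough score mass is covered or the cap is reached.
        if cumulative >= threshold or n >= max_docs:
            return n
    return len(scores)


if __name__ == "__main__":
    peaked = [8.0, 1.0, 1.0]      # one document dominates the ranking
    flat = [1.0, 1.0, 1.0, 1.0]   # no document stands out
    print(static_top_k(peaked, 3), adaptive_cutoff(peaked))
    print(static_top_k(flat, 3), adaptive_cutoff(flat, threshold=0.6))
```

With a peaked score distribution the heuristic stops early and passes less noise to the reader, while a flat distribution leads it to retrieve more candidates. That is the noise-information trade-off a fixed cutoff cannot adjust to.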

Code Implementations: 1 repo