Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering
This work addresses efficiency challenges in open-domain QA pipelines that matter to both researchers and practitioners. It is incremental, however, because it optimizes existing methods rather than introducing a new paradigm.
The paper tackles the problem of scaling open-domain question answering by investigating sentence selection techniques that prune retrieved text before expensive reading comprehension models are applied. It shows that lightweight QA models perform well at sentence selection, that retrieval-based models are faster still, and that an ensemble of the two balances speed and performance.
Current methods in open-domain question answering (QA) usually employ a pipeline of first retrieving relevant documents, then applying strong reading comprehension (RC) models to that retrieved text. However, modern RC models are complex and expensive to run, so techniques to prune the space of retrieved text are critical to allow this approach to scale. In this paper, we focus on approaches which apply an intermediate sentence selection step to address this issue, and investigate the best practices for this approach. We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question. We examine trade-offs between processing speed and task performance in these two approaches, and demonstrate an ensemble module that represents a hybrid of the two. From experiments on Open-SQuAD and TriviaQA, we show that very lightweight QA models can do well at this task, but retrieval-based models are faster still. An ensemble module we describe balances between the two and generalizes well cross-domain.
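To make the intermediate sentence selection step concrete, the following is a minimal sketch of a retrieval-based selector. It is not the paper's model: it assumes a plain TF-IDF cosine similarity between the question and each sentence, and the names `tokenize`, `tfidf_vectors`, `cosine`, `select_sentences`, and the parameter `k` are hypothetical, introduced only for illustration. The idea is that only the top-scoring sentences would be passed on to the expensive RC model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per token list."""
    n = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    # Smoothed IDF, so terms that occur everywhere keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_sentences(question, sentences, k=2):
    """Return the k sentences most similar to the question, in original order.

    Assumption: the question is treated as one more "document" when
    computing IDF, which is a simplification made for this sketch.
    """
    vectors = tfidf_vectors([tokenize(question)] + [tokenize(s) for s in sentences])
    q_vec, sent_vecs = vectors[0], vectors[1:]
    scores = [cosine(q_vec, v) for v in sent_vecs]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    passage = [
        "Hamlet is a tragedy written by William Shakespeare.",
        "The weather in Denmark is often cold.",
        "Paris is the capital of France.",
    ]
    # Only the selected sentence would be handed to the RC model.
    print(select_sentences("Who wrote Hamlet?", passage, k=1))
```

A QA-based selector would replace the `cosine` score with the confidence of a lightweight QA model, and an ensemble could combine the two scores. How the paper's ensemble module does this is not specified in the abstract, so it is not sketched here.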