CL, AI | Dec 14, 2021

You Only Need One Model for Open-domain Question Answering

Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning, Kyoung-Gu Woo

arXiv:2112.07381v2 | 22.8295 citations | h-index: 132

Originality: Highly original

AI Analysis

The paper addresses model complexity and training inefficiency in open-domain QA, offering NLP researchers and practitioners a more streamlined approach.

The paper tackles the inefficiency of using separate models for retrieval, reranking, and reading in open-domain question answering. It proposes a single end-to-end model that performs retrieval and reranking through internal passage-wise attention mechanisms and, for a fixed parameter budget, outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match points on the Natural Questions and TriviaQA datasets.
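For readers unfamiliar with the metric, the following is a minimal sketch of how an exact match (EM) score is commonly computed in open-domain QA. It assumes the widely used SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); whether the paper's evaluation script normalizes in exactly this way is an assumption, and the function names are illustrative.

```python
import re
import string
from typing import List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: List[str]) -> float:
    """Return 1.0 if the prediction matches any gold answer after normalization."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))


def em_score(predictions: List[str], references: List[List[str]]) -> float:
    """Average exact match over a dataset, reported on a 0-100 scale."""
    total = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * total / len(predictions)
```

Under this definition, a 1.0-point EM improvement means one additional question in every hundred is answered with a string that matches a gold answer after normalization.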

Recent approaches to Open-domain Question Answering refer to an external knowledge base using a retriever model, optionally rerank passages with a separate reranker model, and generate an answer using another reader model. Despite performing related tasks, the models have separate parameters and are weakly coupled during training. We propose casting the retriever and the reranker as internal passage-wise attention mechanisms applied sequentially within the transformer architecture and feeding computed representations to the reader, with the hidden representations progressively refined at each stage. This allows us to use a single question answering model trained end-to-end, which is a more efficient use of model capacity and also leads to better gradient flow. We present a pre-training method to effectively train this architecture and evaluate our model on the Natural Questions and TriviaQA open datasets. For a fixed parameter budget, our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.
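The idea of treating retrieval and reranking as sequential passage-wise attention stages can be illustrated with a toy sketch. This is not the authors' implementation: passages and the question are reduced to single vectors, the `refine` function is a hypothetical stand-in for the intermediate transformer layers, the reader is omitted, and all names and parameters are assumptions made for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def passage_attention(question: Vector, passages: List[Vector], top_k: int) -> List[int]:
    """Attend from the question over passage vectors; keep the top_k positions."""
    scale = math.sqrt(len(question))
    weights = softmax([dot(question, p) / scale for p in passages])
    ranked = sorted(range(len(passages)), key=lambda i: weights[i], reverse=True)
    return ranked[:top_k]


def refine(passage: Vector, question: Vector) -> List[float]:
    """Hypothetical stand-in for transformer layers between the two stages."""
    return [0.5 * p + 0.5 * q for p, q in zip(passage, question)]


def select_passages(question: Vector, passages: List[Vector],
                    retrieve_k: int, rerank_k: int) -> List[int]:
    # Stage 1: retrieval as passage-wise attention over all candidates.
    retrieved = passage_attention(question, passages, retrieve_k)
    # Hidden representations of the surviving passages are refined.
    refined = [refine(passages[i], question) for i in retrieved]
    # Stage 2: reranking as a second attention pass over refined vectors.
    order = passage_attention(question, refined, rerank_k)
    # Map positions back to original passage indices for the reader.
    return [retrieved[j] for j in order]
```

Because both stages live in one function over shared representations, a real model built this way could propagate gradients from the answer loss through the selection scores, which is the coupling the abstract contrasts with separately trained retriever, reranker, and reader models.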
