CL | Oct 11, 2022

Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao

Microsoft

arXiv:2210.05156v2 | 21.4224 citations | h-index: 59 | Has Code

Originality: Highly original

AI Analysis

This paper addresses the parameter inefficiency and underperformance of dense retrieval models for open-domain question answering, offering a more robust and parameter-efficient alternative.

The paper tackles the parameter inefficiency and performance limitations of bi-encoder dense retrievers in open-domain question answering by proposing TASER, an architecture that interleaves shared and specialized blocks in a single encoder. TASER surpasses BM25 in accuracy while using about 60% of the parameters of bi-encoder dense retrievers.

Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular. Specifically, the de facto architecture for open-domain question answering uses two isomorphic encoders that are initialized from the same pretrained model but separately parameterized for questions and passages. This bi-encoder architecture is parameter-inefficient in that there is no parameter sharing between encoders. Further, recent studies show that such dense retrievers underperform BM25 in various settings. We thus propose a new architecture, Task-aware Specialization for dense Retrieval (TASER), which enables parameter sharing by interleaving shared and specialized blocks in a single encoder. Our experiments on five question answering datasets show that TASER can achieve superior accuracy, surpassing BM25, while using about 60% of the parameters of bi-encoder dense retrievers. In out-of-domain evaluations, TASER is also empirically more robust than bi-encoder dense retrievers. Our code is available at https://github.com/microsoft/taser.
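To make the parameter-sharing idea concrete, here is a minimal Python sketch of the bookkeeping behind "interleaving shared and specialized blocks in a single encoder". It is not the code from the TASER repository. The abstract does not say which blocks are shared or how large they are, so the layout in `toy_layout`, the block names, and the unitless sizes are all hypothetical, and the resulting ratio is a property of these made-up numbers rather than the paper's reported figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    params: int        # parameters in ONE copy of the block (unitless toy number)
    specialized: bool  # True: separate copies for questions and passages


def single_encoder_params(blocks):
    """Parameters of one encoder: shared blocks count once, specialized twice."""
    return sum(b.params * (2 if b.specialized else 1) for b in blocks)


def bi_encoder_params(blocks):
    """Parameters of two fully separate encoders with the same layout."""
    return 2 * sum(b.params for b in blocks)


def route(blocks, task):
    """List the parameter sets an input of the given task passes through."""
    if task not in ("question", "passage"):
        raise ValueError(f"unknown task: {task!r}")
    return [f"{b.name}/{task if b.specialized else 'shared'}" for b in blocks]


def toy_layout(num_layers=4):
    """Hypothetical layout (an assumption, not taken from the paper):
    every layer has a shared attention block, and the feed-forward block
    is specialized in every other layer."""
    blocks = []
    for i in range(num_layers):
        blocks.append(Block(f"attn{i}", params=4, specialized=False))
        blocks.append(Block(f"ffn{i}", params=8, specialized=(i % 2 == 1)))
    return blocks


layout = toy_layout()
ratio = single_encoder_params(layout) / bi_encoder_params(layout)
print(route(layout, "question"))
print(route(layout, "passage"))
print(f"single encoder uses {ratio:.0%} of the bi-encoder parameters")
```

Questions and passages share every block except the specialized ones, where each input is routed to the copy for its own task. Changing which blocks are marked `specialized` changes the parameter ratio accordingly.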
