CLIR | Apr 15, 2022

Improving Passage Retrieval with Zero-Shot Question Generation

MILA, UW
arXiv:2204.07496v4 | 385 citations | h-index: 116
Originality: Highly original
AI Analysis

This method enhances retrieval accuracy for open-domain question answering systems, offering a simple, generalizable solution that can be applied to any retrieval method without task-specific training.

The paper addresses passage retrieval for open-domain question answering. It proposes a zero-shot question generation re-ranker that re-scores each retrieved passage by the probability of the input question given that passage. In top-20 retrieval accuracy, this yields absolute improvements of 6%-18% over unsupervised retrievers and up to 12% over supervised ones.
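For readers unfamiliar with the metric, top-k retrieval accuracy is commonly computed as the fraction of questions for which at least one of the k highest-ranked passages contains a gold answer. The sketch below follows that common definition. The function name `top_k_accuracy` is hypothetical, and the case-insensitive substring match is an assumption of this sketch, since the summary does not say how answer matching is done.

```python
from typing import List, Sequence


def top_k_accuracy(
    ranked_passages: Sequence[Sequence[str]],
    answers: Sequence[Sequence[str]],
    k: int = 20,
) -> float:
    """Fraction of questions with a gold answer in the top-k passages.

    ranked_passages[i] is the ranked passage list for question i and
    answers[i] is its list of acceptable answer strings.
    Assumption: a passage "contains" an answer if the lowercased answer
    is a substring of the lowercased passage.
    """
    if not ranked_passages:
        return 0.0
    hits = 0
    for passages, golds in zip(ranked_passages, answers):
        top = passages[:k]
        if any(g.lower() in p.lower() for p in top for g in golds):
            hits += 1
    return hits / len(ranked_passages)


# Toy data, invented for illustration only.
retrieved: List[List[str]] = [
    ["Paris is the capital of France.", "Berlin is in Germany."],
    ["The Nile flows through Egypt.", "Hamlet was written by Shakespeare."],
]
gold = [["Paris"], ["Shakespeare"]]

acc_at_1 = top_k_accuracy(retrieved, gold, k=1)
acc_at_2 = top_k_accuracy(retrieved, gold, k=2)
```

A re-ranker changes only the order within each ranked list, so it can raise accuracy at small k without changing the retrieved set.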

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
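The re-scoring step described in the abstract can be sketched as follows: each retrieved passage is scored by the log-probability of the question conditioned on that passage, and passages are sorted by that score. This is a minimal sketch, not the authors' implementation. The `TokenLogProb` interface and the toy unigram scorer are hypothetical stand-ins for a pre-trained language model, and averaging the log-probability over question tokens is a choice made here rather than something the abstract specifies.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: log p(token | question prefix, passage).
# A real system would back this with a pre-trained language model.
TokenLogProb = Callable[[str, Sequence[str], Sequence[str]], float]


def question_log_likelihood(
    question: Sequence[str],
    passage: Sequence[str],
    token_log_prob: TokenLogProb,
) -> float:
    """Mean log-probability of the question tokens given the passage."""
    total = 0.0
    for t, tok in enumerate(question):
        total += token_log_prob(tok, question[:t], passage)
    return total / max(len(question), 1)


def rerank(
    question: Sequence[str],
    passages: Sequence[Sequence[str]],
    token_log_prob: TokenLogProb,
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, best score first."""
    scored = [
        (i, question_log_likelihood(question, p, token_log_prob))
        for i, p in enumerate(passages)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


def toy_token_log_prob(
    token: str,
    prefix: Sequence[str],
    passage: Sequence[str],
    vocab_size: int = 1000,
) -> float:
    """Toy stand-in for a language model: add-one-smoothed unigram
    probabilities estimated from the passage. It ignores the prefix."""
    counts = Counter(passage)
    return math.log((counts[token] + 1) / (len(passage) + vocab_size))


# Toy data, invented for illustration only.
question = "who wrote hamlet".split()
passages = [
    "the eiffel tower is in paris".split(),
    "william shakespeare wrote the play hamlet".split(),
    "hamlet is a small village".split(),
]

ranking = rerank(question, passages, toy_token_log_prob)
best_index = ranking[0][0]
```

Because the scorer is passed in as a function, the same `rerank` call works on candidates from any first-stage retriever, which mirrors the abstract's claim that the method sits on top of keyword-based or neural retrieval without task-specific training.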

Code Implementations: 1 repo
