RVR: Retrieve-Verify-Retrieve for Comprehensive Question Answering
This work addresses the challenge of retrieving diverse documents for queries with multiple valid answers, offering an incremental improvement over existing methods.
The paper tackles comprehensive question answering by introducing RVR, a multi-round retrieval framework that improves answer coverage by iteratively retrieving and verifying documents. It achieves at least a 10% relative and 3% absolute gain in complete recall on the QAMPARI dataset.
Comprehensively retrieving diverse documents is crucial to address queries that admit a wide range of valid answers. We introduce retrieve-verify-retrieve (RVR), a multi-round retrieval framework designed to maximize answer coverage. In the first round, a retriever takes the original query and returns a candidate document set, from which a verifier identifies a high-quality subset. In subsequent rounds, the query is augmented with previously verified documents to uncover answers not covered in earlier rounds. RVR is effective even with off-the-shelf retrievers, and fine-tuning retrievers for our inference procedure brings further gains. Our method outperforms baselines, including agentic search approaches, achieving at least 10% relative and 3% absolute gain in complete recall percentage on a multi-answer retrieval dataset (QAMPARI). We also see consistent gains on two out-of-domain datasets (QUEST and WebQuestionsSP) across different base retrievers. Our work presents a promising iterative approach for comprehensive answer recall, leveraging a verifier and adapting retrievers to a new inference scenario.
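The multi-round procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `rvr`, `toy_retrieve`, `toy_verify`, and `CORPUS` are hypothetical, the retriever is a toy word-overlap ranker, the verifier is a keyword check, and both the exclusion of already-seen documents and the concatenation-based query augmentation are assumptions made only so the example is self-contained.

```python
from typing import Callable, List, Set

# Hypothetical toy corpus; a real system would search a large document index.
CORPUS = [
    "Paris is a capital city in Europe",
    "Berlin is a capital city in Europe",
    "Madrid is a capital city in Europe",
    "Bananas are yellow fruit",
]

def toy_retrieve(query: str, exclude: Set[str], k: int) -> List[str]:
    """Rank unseen documents by word overlap with the query (stand-in retriever)."""
    q = set(query.lower().split())
    pool = [d for d in CORPUS if d not in exclude]
    ranked = sorted(pool, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:k]

def toy_verify(query: str, doc: str) -> bool:
    """Stand-in verifier: accept documents that mention a capital."""
    return "capital" in doc.lower()

def rvr(
    query: str,
    retrieve: Callable[[str, Set[str], int], List[str]],
    verify: Callable[[str, str], bool],
    rounds: int = 3,
    k: int = 2,
) -> List[str]:
    """Retrieve, verify, then retrieve again with an augmented query."""
    verified: List[str] = []
    seen: Set[str] = set()
    for _ in range(rounds):
        # Assumption: augment the query by appending verified documents as text.
        augmented = " ".join([query, *verified])
        candidates = retrieve(augmented, seen, k)
        if not candidates:
            break  # nothing new left to retrieve
        seen.update(candidates)
        # Keep only the candidates the verifier accepts.
        verified.extend(d for d in candidates if verify(query, d))
    return verified

result = rvr("capital city in Europe", toy_retrieve, toy_verify)
```

In this toy run, a single round returns only the first two documents, while later rounds add the third relevant document and the verifier filters out the irrelevant one. A fine-tuned retriever, as the abstract describes, would presumably learn to use the augmented query itself to surface uncovered answers rather than rely on an explicit exclusion set.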