CL, IR | Jul 9, 2025

Shifting from Ranking to Set Selection for Retrieval Augmented Generation

arXiv:2507.06838v2 | 9 citations | h-index: 4 | Has Code | ACL
Originality: Incremental advance
AI Analysis

This paper addresses a limitation of RAG systems in multi-hop question answering: passages are ranked individually rather than selected as a set. It offers an incremental improvement over existing methods.

The paper tackles the problem of ensuring that retrieved passages collectively form a comprehensive set for complex queries in Retrieval-Augmented Generation. It proposes SETR, which outperforms existing rerankers on multi-hop benchmarks in both answer correctness and retrieval quality.

Retrieval in Retrieval-Augmented Generation (RAG) must ensure that retrieved passages are not only individually relevant but also collectively form a comprehensive set. Existing approaches primarily rerank top-k passages based on their individual relevance, often failing to meet the information needs of complex queries in multi-hop question answering. In this work, we propose a set-wise passage selection approach and introduce SETR, which explicitly identifies the information requirements of a query through Chain-of-Thought reasoning and selects an optimal set of passages that collectively satisfy those requirements. Experiments on multi-hop RAG benchmarks show that SETR outperforms both proprietary LLM-based rerankers and open-source baselines in terms of answer correctness and retrieval quality, providing an effective and efficient alternative to traditional rerankers in RAG systems. The code is available at https://github.com/LGAI-Research/SetR
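
The contrast between individual ranking and set selection can be illustrated with a small sketch. This is not SETR itself: the abstract describes an LLM that identifies requirements through Chain-of-Thought reasoning, whereas the code below substitutes a plain greedy set-cover heuristic. The function names, the hand-labelled requirement strings, and the relevance scores are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Set


def top_k_by_score(scores: Dict[str, float], k: int) -> List[str]:
    """Baseline: keep the k passages with the highest individual relevance."""
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)[:k]


def select_covering_set(
    requirements: Set[str],
    coverage: Dict[str, Set[str]],
    max_passages: int,
) -> List[str]:
    """Greedy stand-in for set-wise selection (an assumption, not the paper's method).

    Repeatedly picks the passage that satisfies the most still-unmet
    requirements, stopping when everything is covered, the budget is
    exhausted, or no remaining passage adds coverage.
    """
    remaining = set(requirements)
    selected: List[str] = []
    while remaining and len(selected) < max_passages:
        candidates = [pid for pid in coverage if pid not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda pid: len(coverage[pid] & remaining))
        if not coverage[best] & remaining:
            break  # no candidate covers anything new
        selected.append(best)
        remaining -= coverage[best]
    return selected


# Hypothetical two-hop query: "Where was the director of film X born?"
requirements = {"director of film X", "birthplace of that director"}
scores = {"p1": 0.90, "p2": 0.85, "p3": 0.40}  # made-up relevance scores
coverage = {
    "p1": {"director of film X"},
    "p2": {"director of film X"},           # redundant with p1
    "p3": {"birthplace of that director"},  # low score, but needed
}

ranked = top_k_by_score(scores, k=2)
chosen = select_covering_set(requirements, coverage, max_passages=2)
```

Under these made-up inputs, the score-based baseline keeps two passages that answer the same sub-question, while the coverage-based selection keeps one passage per requirement. The point is only that the two objectives can disagree for the same passage budget.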

