IndexRAG: Bridging Facts for Cross-Document Reasoning at Index Time

arXiv:2603.16415 | 55.3 | h-index: 1

AI Analysis

This paper addresses efficient multi-hop QA for users who need fast, accurate answers without additional training. The contribution is incremental, since it builds on existing RAG methods.

The paper tackles multi-hop question answering by shifting cross-document reasoning from online inference to offline indexing, resulting in an average F1 improvement of 4.6 points over Naive RAG on three benchmarks.

Multi-hop question answering (QA) requires reasoning across multiple documents, yet existing retrieval-augmented generation (RAG) approaches address this either through graph-based methods that require additional online processing or through iterative multi-step reasoning. We present IndexRAG, a novel approach that shifts cross-document reasoning from online inference to offline indexing. IndexRAG identifies bridge entities shared across documents and generates bridging facts as independently retrievable units, requiring no additional training or fine-tuning. Experiments on three widely used multi-hop QA benchmarks (HotpotQA, 2WikiMultiHopQA, MuSiQue) show that IndexRAG improves F1 over Naive RAG by 4.6 points on average, while requiring only single-pass retrieval and a single LLM call at inference time. When combined with IRCoT, IndexRAG outperforms all graph-based baselines on average, including HippoRAG and FastGraphRAG, while relying solely on flat retrieval. Our code will be released upon acceptance.
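
The index-time idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is not yet released); it only assumes that entities have already been extracted per document, that a bridge entity is one mentioned in at least two documents, and that one bridging fact is generated per document pair. The names `find_bridge_entities`, `build_index`, and `template_fact` are hypothetical, and `template_fact` is a plain string template standing in for the LLM call that would write the bridging fact.

```python
from collections import defaultdict
from itertools import combinations


def find_bridge_entities(doc_entities):
    """Return entities mentioned in two or more documents, with their doc ids."""
    entity_to_docs = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            entity_to_docs[entity].add(doc_id)
    return {e: sorted(ids) for e, ids in entity_to_docs.items() if len(ids) >= 2}


def build_index(docs, doc_entities, generate_fact):
    """Build a flat index: original chunks plus one bridging fact per doc pair.

    Pairwise generation is an assumption made for this sketch.
    """
    index = [
        {"id": doc_id, "text": text, "kind": "chunk"}
        for doc_id, text in docs.items()
    ]
    bridges = find_bridge_entities(doc_entities)
    for entity in sorted(bridges):
        for a, b in combinations(bridges[entity], 2):
            index.append({
                "id": f"bridge:{entity}:{a}+{b}",
                "text": generate_fact(entity, docs[a], docs[b]),
                "kind": "bridge",
            })
    return index


def template_fact(entity, text_a, text_b):
    """Hypothetical stand-in for an LLM call that writes the bridging fact."""
    return f"[{entity}] {text_a} {text_b}"


docs = {
    "d1": "Inception was directed by Christopher Nolan.",
    "d2": "Christopher Nolan was born in London.",
}
# Assumed to come from an upstream entity-extraction step.
doc_entities = {
    "d1": ["Inception", "Christopher Nolan"],
    "d2": ["Christopher Nolan", "London"],
}

index = build_index(docs, doc_entities, template_fact)
for entry in index:
    print(entry["kind"], "|", entry["text"])
```

Because each bridging fact sits in the same flat index as the original chunks, a question such as "Where was the director of Inception born?" could in principle be served by a single retrieval pass, which matches the single-pass, single-LLM-call inference setup described above.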
