IR · CL · Jan 24, 2025

Chain-of-Retrieval Augmented Generation

Microsoft
arXiv:2501.14342v3 · 43 citations · h-index: 22
Originality: Incremental advance
AI Analysis

CoRAG addresses the limitation of single-retrieval RAG methods on complex queries and offers a more effective approach for knowledge-intensive tasks, though it is an incremental advance over existing RAG techniques.

The paper tackles complex queries in retrieval-augmented generation (RAG) by introducing CoRAG, which retrieves and reasons step by step, reformulating the query as it goes. It reports an improvement of more than 10 points in EM score on multi-hop QA tasks and state-of-the-art performance on the KILT benchmark.

This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Conventional RAG methods usually perform a single retrieval step before the generation process, which limits their effectiveness in addressing complex queries due to imperfect retrieval results. In contrast, our proposed method, CoRAG (Chain-of-Retrieval Augmented Generation), allows the model to dynamically reformulate the query based on the evolving state. To train CoRAG effectively, we utilize rejection sampling to automatically generate intermediate retrieval chains, thereby augmenting existing RAG datasets that only provide the correct final answer. At test time, we propose various decoding strategies to scale the model's test-time compute by controlling the length and number of sampled retrieval chains. Experimental results across multiple benchmarks validate the efficacy of CoRAG, particularly in multi-hop question answering tasks, where we observe more than 10 points improvement in EM score compared to strong baselines. On the KILT benchmark, CoRAG establishes a new state-of-the-art performance across a diverse range of knowledge-intensive tasks. Furthermore, we offer comprehensive analyses to understand the scaling behavior of CoRAG, laying the groundwork for future research aimed at developing factual and grounded foundation models.
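The rejection-sampling idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `CORPUS`, the exact-match `retrieve` function, the random `propose_sub_query` stand-in for the language model, and the rule of accepting the first chain that recovers the gold answer are all hypothetical and are not taken from the paper, whose actual models, retriever, and acceptance criterion may differ.

```python
import random
from typing import List, Optional, Tuple

# A retrieval chain is a list of (sub_query, retrieved_passage) steps.
Chain = List[Tuple[str, str]]

# Hypothetical toy corpus standing in for a real retrieval index.
CORPUS = {
    "director of Inception": "Inception was directed by Christopher Nolan.",
    "birthplace of Christopher Nolan": "Christopher Nolan was born in London.",
    "capital of France": "Paris is the capital of France.",
}


def retrieve(sub_query: str) -> str:
    """Toy retriever: exact-key lookup (a real system would query an index)."""
    return CORPUS.get(sub_query, "")


def propose_sub_query(question: str, chain: Chain, rng: random.Random) -> str:
    """Stand-in for the model that reformulates the query at each step.

    Here it simply samples a corpus key at random, ignoring the question
    and the chain so far.
    """
    return rng.choice(sorted(CORPUS))


def answer_from_chain(chain: Chain) -> str:
    """Stand-in for the generator: answers only if both hops were retrieved."""
    passages = " ".join(passage for _, passage in chain)
    has_director = "directed by Christopher Nolan" in passages
    has_birthplace = "born in London" in passages
    return "London" if has_director and has_birthplace else ""


def sample_chain(question: str, max_steps: int, rng: random.Random) -> Chain:
    """Sample one retrieval chain of fixed length."""
    chain: Chain = []
    for _ in range(max_steps):
        sub_query = propose_sub_query(question, chain, rng)
        chain.append((sub_query, retrieve(sub_query)))
    return chain


def rejection_sample(
    question: str,
    gold_answer: str,
    num_samples: int = 64,
    max_steps: int = 3,
    seed: int = 0,
) -> Optional[Chain]:
    """Return the first sampled chain that leads to the gold answer, else None."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        chain = sample_chain(question, max_steps, rng)
        if answer_from_chain(chain) == gold_answer:
            return chain  # accepted: usable as an augmented training example
    return None  # every sample was rejected


if __name__ == "__main__":
    question = "Where was the director of Inception born?"
    accepted = rejection_sample(question, "London")
    for sub_query, passage in accepted or []:
        print(f"{sub_query!r} -> {passage!r}")
```

In this sketch only the final answer is supervised, which mirrors the setting the abstract describes: the dataset provides the correct answer, and the intermediate retrieval steps are filled in by keeping sampled chains that reach it. Varying `max_steps` and `num_samples` loosely corresponds to the chain length and chain count that the abstract names as test-time compute controls.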

