LG | IR | Nov 10, 2025

Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training

arXiv:2511.07328v1 | 1 citation | h-index: 20
Originality: Highly original
AI Analysis

Q-RAG addresses the need for efficient multi-step retrieval in open-domain question answering and offers a resource-efficient alternative to methods that fine-tune an LLM to perform the retrieval.

The paper tackles multi-step retrieval for complex questions in Retrieval-Augmented Generation (RAG). It proposes Q-RAG, which fine-tunes the Embedder model with reinforcement learning, and reports state-of-the-art results on the Babilong and RULER benchmarks for contexts of up to 10M tokens.

Retrieval-Augmented Generation (RAG) methods enhance LLM performance by efficiently filtering relevant context for LLMs, reducing hallucinations and inference cost. However, most existing RAG methods focus on single-step retrieval, which is often insufficient for answering complex questions that require multi-step search. Recently, multi-step retrieval approaches have emerged, typically involving the fine-tuning of small LLMs to perform multi-step retrieval. This type of fine-tuning is highly resource-intensive and does not enable the use of larger LLMs. In this work, we propose Q-RAG, a novel approach that fine-tunes the Embedder model for multi-step retrieval using reinforcement learning (RL). Q-RAG offers a competitive, resource-efficient alternative to existing multi-step retrieval methods for open-domain question answering and achieves state-of-the-art results on the popular long-context benchmarks Babilong and RULER for contexts up to 10M tokens.
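
To make the idea of multi-step retrieval concrete, here is a minimal sketch of a greedy retrieval loop in which the retriever re-embeds its state (the question plus the chunks picked so far) before each step. It is not the authors' implementation. The abstract does not describe the training procedure or the scoring function, so several parts are assumptions made for illustration: the `embed` function is a toy bag-of-words stand-in for a learned Embedder, the inner product in `score` stands in for a learned value estimate, and the names `multi_step_retrieve`, `embed`, `score`, and `STOPWORDS` are hypothetical. No reinforcement learning is performed here; only the inference-time loop is shown.

```python
import re
from collections import Counter

# Hypothetical stopword list so the toy embedder ignores filler words.
STOPWORDS = {"the", "is", "to", "where", "a"}


def embed(text):
    """Toy stand-in for a learned embedder: a bag-of-words count vector."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def score(state_vec, chunk_vec):
    """Inner product of two count vectors.

    Assumption: in a value-based retriever this similarity would play the
    role of a learned value estimate for picking the chunk in this state.
    """
    return sum(count * chunk_vec[tok] for tok, count in state_vec.items())


def multi_step_retrieve(query, chunks, steps):
    """Greedily pick one chunk per step, re-embedding the state each time."""
    chunk_vecs = [embed(c) for c in chunks]
    selected = []
    for _ in range(min(steps, len(chunks))):
        # State = question plus everything retrieved so far.
        state_text = " ".join([query] + [chunks[i] for i in selected])
        state_vec = embed(state_text)
        remaining = [i for i in range(len(chunks)) if i not in selected]
        best = max(remaining, key=lambda i: score(state_vec, chunk_vecs[i]))
        selected.append(best)
    return selected


chunks = [
    "John went to the garden.",
    "Mary went to the kitchen.",
    "Mary took the apple.",
    "The sky is blue.",
]
picked = multi_step_retrieve("Where is the apple?", chunks, steps=2)
print(picked)  # [2, 1]
```

The question alone matches only "Mary took the apple." Once that chunk is part of the state, "Mary" becomes a search term and the second step reaches "Mary went to the kitchen.", which a single retrieval pass over the question would not rank above the other chunks.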

