IR · AI · May 30

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

arXiv:2606.00590 · 78.0 · h-index: 13
Predicted impact: top 30% in IR · last 90 days · Originality: Incremental advance
AI Analysis

For developers of agentic search systems, Critic-R provides a practical framework to improve retrieval without costly annotations or co-training, addressing a key bottleneck in real-world deployment.

Critic-R improves agentic search by using a critic model that evaluates the agent's introspective reasoning to refine queries and optimize retrievers without manual annotations, achieving significant gains in retrieval quality and answer accuracy on four multi-hop QA benchmarks.

Agentic search systems iteratively interact with retrieval models to answer complex queries. Despite substantial progress, optimizing retrievers for agentic search remains challenging, often requiring heavy co-training or gold-standard annotations that limit real-world applicability. We propose Critic-R, a framework that explicitly closes the feedback loop between the reasoning agent and the retrieval model during both inference and training. Critic-R introduces a critic model that evaluates the agent's introspective reasoning trace after consuming retrieved evidence to determine whether the retrieved context sufficiently supports the next reasoning step. Critic-R has two complementary mechanisms: Critic-R-Zero, an inference-time query refinement loop that iteratively rewrites queries and retrieval instructions, and Critic-Embed, an optimization approach for retrieval models that leverages successful and failed refinement trajectories as automatic supervision without requiring manual relevance annotation. We evaluate Critic-R on HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle. Results show that Critic-R significantly improves both retrieval quality and downstream answer accuracy.
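The two mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`refine_loop`, `to_training_triples`), the `Step` record, the default retrieval instruction, and the toy stubs standing in for the retriever, agent, critic, and rewriter are all hypothetical. In particular, the way refinement trajectories are turned into (query, positive, negative) triples is an assumption made for illustration; the abstract only says that successful and failed trajectories serve as automatic supervision.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One round of the refinement loop (hypothetical record type)."""
    query: str
    instruction: str
    docs: List[str]
    sufficient: bool  # the critic's verdict for this round


def refine_loop(
    question: str,
    retrieve: Callable[[str, str], List[str]],
    reason: Callable[[str, List[str]], str],
    critic: Callable[[str, List[str]], bool],
    rewrite: Callable[[str, str, str, str], Tuple[str, str]],
    max_rounds: int = 3,
) -> List[Step]:
    """Inference-time loop: retrieve, let the agent reason, ask the critic,
    and rewrite the query and instruction until the evidence is judged sufficient."""
    query = question
    instruction = "Retrieve passages that help answer the question."  # assumed default
    trajectory: List[Step] = []
    for _ in range(max_rounds):
        docs = retrieve(query, instruction)
        trace = reason(question, docs)   # the agent's introspective reasoning trace
        ok = critic(trace, docs)         # does the context support the next step?
        trajectory.append(Step(query, instruction, docs, ok))
        if ok:
            break
        query, instruction = rewrite(question, query, instruction, trace)
    return trajectory


def to_training_triples(trajectory: List[Step]) -> List[Tuple[str, str, str]]:
    """Assumed supervision scheme: documents from critic-approved rounds are
    positives, documents from rejected rounds are negatives, and the anchor
    is the original (unrefined) query."""
    if not trajectory:
        return []
    anchor = trajectory[0].query
    positives = [d for s in trajectory if s.sufficient for d in s.docs]
    negatives = [
        d for s in trajectory if not s.sufficient for d in s.docs
        if d not in positives
    ]
    return [(anchor, p, n) for p in positives for n in negatives]


# Toy stand-ins so the sketch runs end to end (all hypothetical).
INDEX = {
    "tower date": ["Bananas are rich in potassium."],
    "Eiffel Tower completion year": ["The Eiffel Tower was completed in 1889."],
}


def toy_retrieve(query: str, instruction: str) -> List[str]:
    return INDEX.get(query, [])


def toy_reason(question: str, docs: List[str]) -> str:
    if any("1889" in d for d in docs):
        return "I can answer."
    return "Evidence is missing the completion year."


def toy_critic(trace: str, docs: List[str]) -> bool:
    return "missing" not in trace


def toy_rewrite(question: str, query: str, instruction: str, trace: str) -> Tuple[str, str]:
    return "Eiffel Tower completion year", instruction + " Focus on dates."


trajectory = refine_loop("tower date", toy_retrieve, toy_reason, toy_critic, toy_rewrite)
triples = to_training_triples(trajectory)
print(len(trajectory), triples)
```

In this toy run the first round retrieves an irrelevant passage, the critic rejects it, and the rewritten query succeeds on the second round. The failed round's passage then becomes a negative for contrastive retriever training, which is one plausible way to obtain supervision without manual relevance labels.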

