ReAct
ReAct: Synergizing Reasoning and Acting in Language Models · Retrieval-augmented generation · first seen Oct 6, 2022
superseded — cited as a baseline and beaten by newer methods
6 papers critique it · 15 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites ReAct as a baseline.
“Existing architectures do not instantiate this full stack. For example, standard RAG systems retrieve based solely on keywords without reasoning, while existing search agents such as ReAct interleave thought and action but lack an explicit mechanism to target generated queries toward answer reasoning.”
— RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation
“There is a fundamental mismatch between the agent's actual execution history and the reshaped prompt presented to the model. This structural blindness masks crucial state parameters; specifically in RAG tasks, it leads to repetitive queries and useless interactions with search engines.”
— VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
“ReAct RAG occasionally provides marginal gains over RAG by enabling iterative retrieval and broader evidence coverage; however, it does not consistently translate additional retrieval steps into reliable performance improvements”
— EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
“However, a fundamental limitation of ReAct is that the reasoning and retrieval plan exists entirely within the LLM's context window, leading to context overflow as reasoning chains grow, plan fragmentation, and high latency from sequential execution.”
— Plan*RAG: Efficient Test-Time Planning for Retrieval Augmented Generation
“However, it cannot foresee the features of different retrieval sources and heavily relies on their descriptions for selection, leading to low-quality and unstable multi-source retrieval.”
— PrefRAG: Preference-Driven Multi-Source Retrieval Augmented Generation
“However, planning for complex questions is non-trivial, especially for smaller LLMs (with fewer than 10 billion parameters), which often require supervised fine-tuning”
— Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs
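Several of the critiques above concern how a ReAct-style agent carries its state: thoughts, actions, and observations are all appended to a single prompt, one step at a time. The following is a minimal sketch of that loop, not the ReAct authors' implementation. The toy corpus, the `search` function, and `scripted_policy` (a stand-in for the language model) are all hypothetical, chosen so the example runs without a model or a search engine.

```python
from typing import Callable, List, Tuple

# Hypothetical toy corpus; a real agent would query a search engine or index.
CORPUS = {
    "capital of france": "Paris is the capital of France.",
    "population of paris": "Paris has about 2.1 million residents.",
}


def search(query: str) -> str:
    """Stand-in retriever: exact-match lookup over the toy corpus."""
    return CORPUS.get(query.lower().strip(), "No results.")


def scripted_policy(prompt: str) -> Tuple[str, str, str]:
    """Stand-in for the LLM; returns (thought, action, argument).

    It chooses the next step by counting the observations already in the
    prompt, which keeps the example deterministic.
    """
    step = prompt.count("Observation:")
    if step == 0:
        return ("I need the capital first.", "search", "capital of France")
    if step == 1:
        return ("Now I need its population.", "search", "population of Paris")
    return ("I have enough evidence.", "finish", "about 2.1 million")


def react(
    question: str,
    policy: Callable[[str], Tuple[str, str, str]],
    max_steps: int = 5,
) -> Tuple[str, List[int]]:
    """Run a thought/action/observation loop; return the answer and prompt sizes."""
    prompt = f"Question: {question}\n"
    prompt_lengths: List[int] = []
    for _ in range(max_steps):
        # The policy sees the full history every step; nothing is stored elsewhere.
        thought, action, arg = policy(prompt)
        prompt += f"Thought: {thought}\nAction: {action}[{arg}]\n"
        if action == "finish":
            return arg, prompt_lengths
        # Each retrieval waits on the previous step, so execution is sequential.
        prompt += f"Observation: {search(arg)}\n"
        prompt_lengths.append(len(prompt))
    return "", prompt_lengths


answer, lengths = react("How many people live in the capital of France?", scripted_policy)
print(answer, lengths)
```

The recorded prompt lengths grow with every retrieval step, because the whole plan and its evidence live in the prompt. That is the property the Plan*RAG quote ties to context overflow and sequential latency.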
Beaten on benchmarks
Head-to-head results where a newer method reports beating ReAct. Values are copied from the source paper's tables — verify against the cited paper.
- DeepNote: Note-Centric Deep Retrieval-Augmented Generation
DeepNote beats ReAct · f1 [Adaptive RAG baseline ReAct]
51.1 vs 46.9
- Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
GenGround beats ReAct · F1 [F1]
52.26 vs 40.70
- Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
GenGround beats ReAct · Acc [Acc]
47.27 vs 33.10
- Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
GenGround beats ReAct · Acc† [Acc†]
55.73 vs 37.12
- EfficientGraph-RAG: Structured Retrieval-State Management for Cross-Task Retrieval-Augmented Generation
EfficientGraph-RAG beats ReAct · EM [LongBench (AVG)]
0.362 vs 0.318
- EfficientGraph-RAG: Structured Retrieval-State Management for Cross-Task Retrieval-Augmented Generation
EfficientGraph-RAG beats ReAct · F1 [LongBench (AVG)]
0.454 vs 0.414
- VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
VimRAG beats ReAct · Overall [Qwen3-VL-4B-Instruct]
45.2 vs 33.6
- VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
VimRAG beats ReAct · Overall [Qwen3-VL-8B-Instruct]
50.1 vs 37.7
- MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG
MES-RAG beats ReAct · Accuracy [ReAct baseline]
0.80 vs 0.66
- EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
EHR-RAG beats ReAct · Macro F1 [Long Length of Stay]
70.15 vs 60.85
- EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
EHR-RAG beats ReAct · Macro F1 [30-day Readmission]
60.91 vs 45.22
- EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
EHR-RAG beats ReAct · Macro F1 [Acute MI]
76.29 vs 56.49
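The results above mix 0-1 and 0-100 scales, so the raw gaps are hard to read side by side. The sketch below computes each newer method's relative gain over ReAct from the values exactly as listed. The `RESULTS` layout and the `relative_gain` helper are illustrative assumptions, not part of any cited paper. The rows come from different benchmarks and metrics, so the percentages normalize scale only and do not rank the methods against one another.

```python
# (newer method, metric [setting], newer score, ReAct score), copied from the list above.
RESULTS = [
    ("DeepNote", "f1 [Adaptive RAG baseline ReAct]", 51.1, 46.9),
    ("GenGround", "F1", 52.26, 40.70),
    ("GenGround", "Acc", 47.27, 33.10),
    ("GenGround", "Acc†", 55.73, 37.12),
    ("EfficientGraph-RAG", "EM [LongBench (AVG)]", 0.362, 0.318),
    ("EfficientGraph-RAG", "F1 [LongBench (AVG)]", 0.454, 0.414),
    ("VimRAG", "Overall [Qwen3-VL-4B-Instruct]", 45.2, 33.6),
    ("VimRAG", "Overall [Qwen3-VL-8B-Instruct]", 50.1, 37.7),
    ("MES-RAG", "Accuracy [ReAct baseline]", 0.80, 0.66),
    ("EHR-RAG", "Macro F1 [Long Length of Stay]", 70.15, 60.85),
    ("EHR-RAG", "Macro F1 [30-day Readmission]", 60.91, 45.22),
    ("EHR-RAG", "Macro F1 [Acute MI]", 76.29, 56.49),
]


def relative_gain(new: float, baseline: float) -> float:
    """Percentage improvement of `new` over `baseline` (scale-invariant)."""
    return (new - baseline) / baseline * 100.0


for method, metric, new, base in RESULTS:
    print(f"{method:20s} {metric:34s} {relative_gain(new, base):+6.1f}%")
```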
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Narrative Knowledge Weaver · Narrative Knowledge Weaver: Narrative-Centric Retrieval-Augmented Reasoning for Long-Form Text Understanding · Jun 4, 2026
- Jun 4, 2026
- May 30, 2026
- LegalGraphRAG · LegalGraphRAG: Multi-Agent Graph Retrieval-Augmented Generation for Reliable Legal Reasoning · May 27, 2026
- May 27, 2026
- In-Context Optimization for RAG · In-Context Optimization for Retrieval-Augmented Generation: A Gradient-Descent Perspective · May 25, 2026
- EfficientGraph-RAG · EfficientGraph-RAG: Structured Retrieval-State Management for Cross-Task Retrieval-Augmented Generation · May 25, 2026
- May 22, 2026
- May 12, 2026
- May 7, 2026
- Chain of Evidence (CoE) · Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation · May 2, 2026
- CERTA · "I Don't Know" -- Towards Appropriate Trust with Certainty-Aware Retrieval Augmented Generation · May 1, 2026