ReAct (filed under Retrieval-augmented generation): superseded, meaning it is cited as a baseline and beaten by newer methods. 6 papers critique it and 15 beat it on benchmarks, which ranks it #12 of 1,179 most-superseded methods. Sub-problem: the cluster led by RAG. Newer alternatives in the same sub-problem include Narrative Knowledge Weaver, IA-RAG, MemGraphRAG, LegalGraphRAG, and R2C.


Superseded baseline · #12 of 1,179 most-superseded

ReAct

ReAct: Synergizing Reasoning and Acting in Language Models

Retrieval-augmented generation · first seen Oct 6, 2022

superseded — cited as a baseline and beaten by newer methods

6 papers critique it · 15 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites ReAct as a baseline.

“Existing architectures do not instantiate this full stack. For example, standard RAG systems retrieve based solely on keywords without reasoning, while existing search agents such as ReAct interleave thought and action but lack an explicit mechanism to target generated queries toward answer reasoning.”
— RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation
“There is a fundamental mismatch between the agent's actual execution history and the reshaped prompt presented to the model. This structural blindness masks crucial state parameters; specifically in RAG tasks, it leads to repetitive queries and useless interactions with search engines.”
— VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
“ReAct RAG occasionally provides marginal gains over RAG by enabling iterative retrieval and broader evidence coverage; however, it does not consistently translate additional retrieval steps into reliable performance improvements”
— EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
“However, a fundamental limitation of ReAct is that the reasoning and retrieval plan exists entirely within the LLM's context window, leading to context overflow as reasoning chains grow, plan fragmentation, and high latency from sequential execution.”
— Plan*RAG: Efficient Test-Time Planning for Retrieval Augmented Generation
“However, it cannot foresee the features of different retrieval sources and heavily relies on their descriptions for selection, leading to low-quality and unstable multi-source retrieval.”
— PrefRAG: Preference-Driven Multi-Source Retrieval Augmented Generation
“However, planning for complex questions is non-trivial, especially for smaller LLMs (with fewer than 10 billion parameters), which often require supervised fine-tuning”
— Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs
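
Several of the critiques above turn on how ReAct interleaves thoughts, actions, and observations in a single growing context. The following is a minimal sketch of that loop, not the paper's implementation: `react_loop`, `scripted_policy`, and `toy_search` are hypothetical names, the policy is a scripted stand-in for an LLM call, and the search tool is a dictionary lookup. It illustrates why the transcript (and therefore the prompt) grows with every step.

```python
from typing import Callable, List, Optional, Tuple

# A policy maps the current transcript to (thought, action, argument).
# In a real system this would be an LLM call; here it is an assumption.
Policy = Callable[[str], Tuple[str, str, str]]


def react_loop(
    question: str,
    policy: Policy,
    search: Callable[[str], str],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Interleave thought, action, and observation in one transcript."""
    transcript: List[str] = [f"Question: {question}"]
    for _ in range(max_steps):
        # The whole history is passed back in on every step, so the
        # context grows linearly with the number of steps taken.
        thought, action, arg = policy("\n".join(transcript))
        transcript.append(f"Thought: {thought}")
        transcript.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg, transcript
        transcript.append(f"Observation: {search(arg)}")
    return None, transcript  # step budget exhausted without an answer


def scripted_policy(context: str) -> Tuple[str, str, str]:
    """Hypothetical stand-in for an LLM: search once, then answer."""
    if context.count("Observation:") == 0:
        return ("I need the author's birthplace.", "search", "author birthplace")
    return ("I have enough evidence.", "finish", "Springfield")


def toy_search(query: str) -> str:
    """Hypothetical retrieval tool backed by a tiny in-memory corpus."""
    corpus = {"author birthplace": "The author was born in Springfield."}
    return corpus.get(query, "no results")


answer, transcript = react_loop(
    "Where was the author born?", scripted_policy, toy_search
)
print(answer)
print("\n".join(transcript))
```

Because the plan lives only in the transcript, a policy that keeps issuing searches simply fills the context until `max_steps` runs out, which is the behaviour the Plan*RAG and VimRAG quotes describe.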

Beaten on benchmarks

Head-to-head results where a newer method reports beating ReAct. Values are copied from the source paper's tables — verify against the cited paper.

DeepNote beats ReAct · f1 [Adaptive RAG baseline ReAct]
51.1 vs 46.9
DeepNote: Note-Centric Deep Retrieval-Augmented Generation
GenGround beats ReAct · F1 [F1]
52.26 vs 40.70
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
GenGround beats ReAct · Acc [Acc]
47.27 vs 33.10
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
GenGround beats ReAct · Acc† [Acc†]
55.73 vs 37.12
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
EfficientGraph-RAG beats ReAct · EM [LongBench (AVG)]
0.362 vs 0.318
EfficientGraph-RAG: Structured Retrieval-State Management for Cross-Task Retrieval-Augmented Generation
EfficientGraph-RAG beats ReAct · F1 [LongBench (AVG)]
0.454 vs 0.414
EfficientGraph-RAG: Structured Retrieval-State Management for Cross-Task Retrieval-Augmented Generation
VimRAG beats ReAct · Overall [Qwen3-VL-4B-Instruct]
45.2 vs 33.6
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
VimRAG beats ReAct · Overall [Qwen3-VL-8B-Instruct]
50.1 vs 37.7
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
MES-RAG beats ReAct · Accuracy [ReAct baseline]
0.80 vs 0.66
MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG
EHR-RAG beats ReAct · Macro F1 [Long Length of Stay]
70.15 vs 60.85
EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
EHR-RAG beats ReAct · Macro F1 [30-day Readmission]
60.91 vs 45.22
EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
EHR-RAG beats ReAct · Macro F1 [Acute MI]
76.29 vs 56.49
EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
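
The entries above mix scales (some scores are reported on 0 to 1, others on 0 to 100), so raw differences are not directly comparable across papers. As a rough aid, the sketch below recomputes absolute and relative margins from the values listed on this page. The `Result` type and `relative_gain` helper are hypothetical names introduced for illustration, and the numbers are copied from the listing rather than re-verified against the cited papers.

```python
from typing import NamedTuple


class Result(NamedTuple):
    method: str
    metric: str
    setting: str
    new: float    # score reported for the newer method
    react: float  # score reported for ReAct in the same table


# Values transcribed from the listing above; verify against each paper.
RESULTS = [
    Result("DeepNote", "f1", "Adaptive RAG baseline ReAct", 51.1, 46.9),
    Result("GenGround", "F1", "F1", 52.26, 40.70),
    Result("GenGround", "Acc", "Acc", 47.27, 33.10),
    Result("GenGround", "Acc†", "Acc†", 55.73, 37.12),
    Result("EfficientGraph-RAG", "EM", "LongBench (AVG)", 0.362, 0.318),
    Result("EfficientGraph-RAG", "F1", "LongBench (AVG)", 0.454, 0.414),
    Result("VimRAG", "Overall", "Qwen3-VL-4B-Instruct", 45.2, 33.6),
    Result("VimRAG", "Overall", "Qwen3-VL-8B-Instruct", 50.1, 37.7),
    Result("MES-RAG", "Accuracy", "ReAct baseline", 0.80, 0.66),
    Result("EHR-RAG", "Macro F1", "Long Length of Stay", 70.15, 60.85),
    Result("EHR-RAG", "Macro F1", "30-day Readmission", 60.91, 45.22),
    Result("EHR-RAG", "Macro F1", "Acute MI", 76.29, 56.49),
]


def relative_gain(r: Result) -> float:
    """Percentage improvement over ReAct; independent of the score scale."""
    return (r.new - r.react) / r.react * 100.0


for r in RESULTS:
    print(
        f"{r.method:<20} {r.metric:<9} [{r.setting}] "
        f"{r.new - r.react:+.3f} abs, {relative_gain(r):+.1f}% rel"
    )
```

Relative gain is used here only because it is scale-free. It does not make results from different benchmarks, models, or evaluation protocols comparable with one another.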

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.