RobustRAG
Retrieval-augmented generation
superseded — cited as a baseline and beaten by newer methods
13 papers critique it · 8 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites RobustRAG as a baseline.
“one cannot expect that LLMs always generate correct judgments, and thus the manipulated final input might lose crucial information or include wrong information before conducting answer generation”
— OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG
“Although existing defenses can mitigate the impact of poisoning attacks on RAG systems to some extent, they remain vulnerable to advanced attacks, where the attacker craft sophisticated strategies to bypass current safeguards”
— Traceback of Poisoning Attacks to Retrieval-Augmented Generation
“These approaches share an implicit assumption: if the system can identify poisoned evidence, it will naturally avoid acting on it. We show this assumption is incorrect. The deeper problem is the monitoring-control gap: models may detect contradictions and untrustworthy evidence, yet this awareness does not reliably govern their final output.”
— Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control
“they suffer from additional computational overhead, such as multiple LLM inferences or substantial memory consumption.”
— Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems
“For instance, RobustRAG~xiang2024certifiably fails when an attacker poisons more than half of the retrieved texts for a target question.”
— Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
“RobustRAG~xiang2024certifiably follows an "isolate-then-aggregate" pipeline, where answers are independently generated for each retrieved document and then aggregated. This approach not only incurs high inference costs but also becomes ineffective when the proportion of negative documents is high.”
— RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
“RobustRAG~xiang2024RobustRAG, as the major existing RAG framework for adversarial robustness, suffers from limited performance in benign (no-attack) scenarios and struggles in complex generation tasks.”
— ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
“For numerical manipulation: the poisoned passage is semantically identical to the original and will appear in top-k with high rank. The isolated response from the poisoned passage will say `$15,500` while responses from other passages (if any discuss the same topic) will say `$15,000.` However, in a mixed corpus with 1,000+ passages, the probability that multiple top-k passages discuss the exact same numerical claim is low. The poisoned passage often stands alone on its topic, making majority vote ineffective because there is no majority to outvote it.”
— RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems
“the aforementioned defense methods necessitate the integration of additional large models, incurring considerable overheads. Meanwhile, it is difficult to promptly assess whether the current response of RAG is trustworthy or not. Moreover, they're all ``best-effort'' schemes, offering no guarantee on the defense effectiveness.”
— RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
“However, CAR~weller-etal-2024-defending and RobustRAG~xiang2024certifiably require multiple rounds of model inference, leading to inefficiency.”
— CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG
“RobustRAG extracts document keywords as context, which also results in information loss.”
— BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation
“Heuristic aggregation or filtering~xiang2024certifiably often causes utility loss, while optimization-based consistency selection~shenreliabilityrag typically relies on approximations without strong guarantees.”
— RADAR: Defending RAG Dynamically against Retrieval Corruption
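Several of the critiques above turn on the same mechanism: answers are generated per passage in isolation and then aggregated, so the outcome depends on how many passages an attacker controls. The sketch below is a simplified illustration of that idea, not RobustRAG's actual implementation. It assumes plain majority voting as the aggregation step, and `toy_reader` is a hypothetical stand-in for the per-passage LLM call; both names and the sample passages are invented for illustration.

```python
from collections import Counter


def isolate_then_aggregate(passages, answer_from_passage):
    """Answer each passage in isolation, then return the most common answer.

    Assumption: aggregation is a plain majority vote. Passages that yield
    no answer (None) abstain and are not counted.
    """
    isolated = [answer_from_passage(p) for p in passages]
    votes = Counter(a for a in isolated if a is not None)
    if not votes:
        return None
    answer, _count = votes.most_common(1)[0]
    return answer


def toy_reader(passage):
    """Hypothetical stand-in for an LLM call: reads the text after 'price:'."""
    marker = "price:"
    if marker not in passage:
        return None
    return passage.split(marker, 1)[1].strip()


benign = ["price: $15,000"] * 3
poisoned = ["price: $15,500"]
off_topic = ["this passage discusses something else"] * 4

# Poison is a minority of the votes (3 benign vs 2 poisoned): benign answer wins.
minority_case = isolate_then_aggregate(benign + poisoned * 2, toy_reader)

# Poison is more than half (3 benign vs 4 poisoned): the vote flips.
majority_case = isolate_then_aggregate(benign + poisoned * 4, toy_reader)

# Poisoned passage is the only one on topic: nothing is left to outvote it.
lone_case = isolate_then_aggregate(off_topic + poisoned, toy_reader)

print(minority_case, majority_case, lone_case)
```

The second and third cases correspond to the failure modes quoted from the RAGOrigin and RAGShield papers: a poisoned majority, and a poisoned passage that stands alone on its topic. The loop also makes one model call per passage, which is the inference cost the RbFT and CrAM quotes point to.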
Beaten on benchmarks
Head-to-head results where a newer method reports beating RobustRAG. Values are copied from the source paper's tables — verify against the cited paper.
- After Retrieval, Before Generation: Enhancing the Trustworthiness of Large Language Models in RAG
BRIDGE_GRPO beats RobustRAG · Accuracy [GPT-3.5-turbo / TRD Simu]
77.90 vs 62.50
- Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
alg beats RobustRAG · DACC [NQ dataset / PRAGB attack]
0.99 vs 0.75
- Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation
alg beats RobustRAG · DACC [HotpotQA dataset / PRAGB attack]
1.00 vs 0.88
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
RbFT beats RobustRAG · EM [Llama, Clean (τ=0)]
48.4 vs 32.1
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
RbFT beats RobustRAG · EM [Llama, Normal (τ=0.4) - Noisy]
44.5 vs 26.7
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
RbFT beats RobustRAG · EM [Llama, Hard (τ=1.0) - Counterfactual]
33.8 vs 10.1
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
RbFT beats RobustRAG · EM [Qwen, Clean (τ=0)]
45.4 vs 25.8
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
RbFT beats RobustRAG · EM [Qwen, Hard (τ=1.0) - Counterfactual]
25.1 vs 7.7
- Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection
GMTP beats RobustRAG · ASR (Attack Success Rate, lower is better) [PoisonedRAG attack, Generation phase, NQ dataset]
3.5 vs 54.0
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
MIS beats RobustRAG · RQA @Pos 1 [Mistral-7B]
68 vs 53
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
MIS beats RobustRAG · RQA @Pos 10 [Mistral-7B]
60 vs 55
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
MIS beats RobustRAG · NQ @Pos 1 [Mistral-7B]
54.8 vs 44.4
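The entries above mix metrics where higher is better (Accuracy, EM, DACC) with one where lower is better (ASR), so raw differences are not directly comparable. As a rough sketch, the margins can be put on a common sign convention as follows. The values are a subset copied from the list above; the `higher_is_better` flags and the `margin` helper are assumptions added for illustration.

```python
# (newer method, setting, newer score, RobustRAG score, higher_is_better)
results = [
    ("BRIDGE_GRPO", "Accuracy, GPT-3.5-turbo / TRD Simu", 77.90, 62.50, True),
    ("RbFT", "EM, Llama, Clean", 48.4, 32.1, True),
    ("GMTP", "ASR, PoisonedRAG, NQ", 3.5, 54.0, False),
    ("MIS", "RQA @Pos 10, Mistral-7B", 60.0, 55.0, True),
]


def margin(newer, baseline, higher_is_better):
    """Return the improvement over the baseline as a positive number
    whenever the newer method is better, whichever way the metric points."""
    diff = newer - baseline if higher_is_better else baseline - newer
    return round(diff, 2)


margins = {
    (name, setting): margin(newer, base, hib)
    for name, setting, newer, base, hib in results
}

for (name, setting), m in margins.items():
    print(f"{name:12s} {setting:40s} +{m}")
```

A positive margin here only says that the newer method scored better in the setting its own paper reports. The settings differ in model, dataset, and attack, so margins from different rows should not be ranked against each other.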
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 26, 2026
- May 19, 2026
- May 1, 2026
- Beyond Factual Grounding · Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation · Apr 13, 2026
- RAGShield · RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems · Apr 1, 2026
- Mar 24, 2026
- Jan 13, 2026
- Oct 10, 2025
- RADAR · RADAR: A Risk-Aware Dynamic Multi-Agent Framework for LLM Safety Evaluation via Role-Specialized Collaboration · Sep 28, 2025
- RAGOrigin · Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation · Sep 17, 2025