TrustRAG
TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation
Retrieval-augmented generation · first seen Jan 1, 2025
superseded — cited as a baseline and beaten by newer methods
4 papers critique it · 4 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites TrustRAG as a baseline.
“Existing defenses treat poisoning as a content-quality problem: filtering bad documents [ragdefender], detecting anomalous signals [revprag, avfilter], scoring trustworthiness [trustrag], or isolating passages [robustrag]. These approaches share an implicit assumption: if the system can identify poisoned evidence, it will naturally avoid acting on it. We show this assumption is incorrect.”
— Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control
“it makes the unrealistic assumption that malicious documents form a separate cluster in the embedding space.”
— ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
“For numerical manipulation: the poisoned passage clusters with the legitimate passage (cosine similarity 0.9997 means they are in the same cluster). Stage 1 clustering cannot separate them. Stage 2 LLM self-assessment would need to notice that `$15,500` ≠ `$15,000` across two passages in the context window, but research on LLM numerical reasoning shows this is unreliable, especially when the numbers are embedded in otherwise identical text.”
— RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems
“Although promising, these approaches have two major limitations: Majority-voting often fails under high poisoning, while heuristic and aggressive filtering may lose relevant content under low poisoning.”
— SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
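The clustering critique quoted above (a poisoned passage that differs from the legitimate one only in a number lands in the same cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the passages are hypothetical, and the bag-of-words vector is a toy stand-in for an embedding model, not the embedding or clustering procedure used by TrustRAG or by any of the citing papers.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding model: whitespace token counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical passages: the poisoned one changes a single number.
legit = "The annual benefit cap for eligible applicants is $15,000 per household."
poisoned = "The annual benefit cap for eligible applicants is $15,500 per household."
unrelated = "Passport renewal forms must be submitted with two recent photographs."

sim_poisoned = cosine_similarity(bag_of_words(legit), bag_of_words(poisoned))
sim_unrelated = cosine_similarity(bag_of_words(legit), bag_of_words(unrelated))

print(f"legit vs poisoned:  {sim_poisoned:.4f}")
print(f"legit vs unrelated: {sim_unrelated:.4f}")
```

In this toy setup the numerically manipulated passage stays far closer to the legitimate passage than an off-topic one does, so any defense that relies on poisoned documents forming their own cluster has nothing to separate. A real embedding model would produce different values, but the quoted critiques describe the same effect.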
Beaten on benchmarks
Head-to-head results where a newer method reports beating TrustRAG. Each pair lists the newer method's value first, then TrustRAG's. Values are copied from the source paper's tables; verify against the cited paper.
- After Retrieval, Before Generation: Enhancing the Trustworthiness of Large Language Models in RAG
BRIDGE_GRPO beats TrustRAG · Accuracy [GPT-3.5-turbo / TRD Simu]
77.90 vs 66.39
- After Retrieval, Before Generation: Enhancing the Trustworthiness of Large Language Models in RAG
BRIDGE_GRPO beats TrustRAG · Accuracy [Qwen 72B / TRD Simu]
85.48 vs 66.41
- Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection
GMTP beats TrustRAG · ASR (Attack Success Rate, lower is better) [PoisonedRAG attack, Generation phase, NQ dataset]
3.5 vs 11.5
- RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems
RAGShield beats TrustRAG · Overall Attack Success Rate (%, lower is better) [Synthetic corpus, T3-H T6 T-TEMP Overall]
0.0 vs 89.0
- SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
SeCon-RAG beats TrustRAG · PIA_ACC [Mistral-12B, HotpotQA]
77.5 vs 75.5
- SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
SeCon-RAG beats TrustRAG · PR100_ACC [Mistral-12B, HotpotQA]
75.7 vs 75.5
- SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
SeCon-RAG beats TrustRAG · Clean_ACC [Mistral-12B, HotpotQA]
83.0 vs 81.0
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 26, 2026
- May 19, 2026
- May 1, 2026
- Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation · Apr 13, 2026
- RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems · Apr 1, 2026
- Mar 24, 2026
- Jan 13, 2026
- Oct 10, 2025
- RADAR: A Risk-Aware Dynamic Multi-Agent Framework for LLM Safety Evaluation via Role-Specialized Collaboration · Sep 28, 2025
- RAGOrigin · Who Taught the Lie? Responsibility Attribution for Poisoned Knowledge in Retrieval-Augmented Generation · Sep 17, 2025