Uncovering Competing Poisoning Attacks in Retrieval-Augmented Generation
For security researchers and practitioners, this work highlights an overlooked threat in RAG, poisoning by multiple competing adversaries, and shows that existing single-attacker evaluations are insufficient.
This paper introduces the setting of competing poisoning attacks in RAG systems, where multiple adversaries target the same query. Experiments show that many single-attacker strategies degrade under competition, revealing performance inversions and the limits of conventional metrics.
Retrieval-Augmented Generation (RAG) systems improve the factual grounding of large language models (LLMs) but remain vulnerable to retrieval poisoning, where adversaries seed the corpus with manipulated content. Prior work largely evaluates this threat under a simplified single-attacker assumption. In practice, however, high-value or high-visibility queries attract multiple adversaries with conflicting objectives. Motivated by real cases, we introduce the setting of competing attacks, in which multiple attackers simultaneously attempt to steer the same or closely related query toward different targets. We formalize this threat model and propose competitive effectiveness, a metric that quantifies an attacker's advantage under competition. Extensive experiments show that many strategies that succeed in the single-attacker regime degrade markedly under competition, revealing performance inversions and highlighting the limits of conventional metrics such as attack success rate and F1. Furthermore, we present PoisonArena, a standardized framework and benchmark for evaluating poisoning attacks and defenses under realistic, multi-adversary conditions.
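To make the competing-attack setting concrete, the following is a minimal toy sketch, not the PoisonArena framework and not the paper's competitive effectiveness metric (whose definition is not given in this abstract). It assumes a hypothetical bag-of-words cosine retriever, a made-up query, and invented poisoned passages from two hypothetical attackers, A and B. The quantity it reports, the share of top-k retrieval slots each attacker occupies, is only an illustrative proxy for attack strength at the retrieval stage.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, corpus, k):
    """Return the k (owner, text) pairs most similar to the query."""
    return sorted(corpus, key=lambda d: cosine(query, d[1]), reverse=True)[:k]

def slot_share(retrieved, owner):
    """Fraction of retrieved slots held by passages from `owner`."""
    return sum(1 for o, _ in retrieved if o == owner) / len(retrieved)

# Hypothetical query and corpus; all passages are invented for illustration.
query = "who invented the telephone"
clean = [
    ("clean", "alexander graham bell patented the telephone in 1876"),
    ("clean", "the telephone converts sound into electrical signals"),
]
attacker_a = [
    ("A", "who invented the telephone it was invented by person x"),
    ("A", "who invented the telephone the telephone inventor is person x"),
]
attacker_b = [
    ("B", "who invented the telephone who invented the telephone person y"),
    ("B", "who invented the telephone answer person y"),
]

K = 3  # assumed retrieval depth

# Single-attacker regime: each attacker poisons the corpus alone.
alone_a = slot_share(top_k(query, clean + attacker_a, K), "A")
alone_b = slot_share(top_k(query, clean + attacker_b, K), "B")

# Competing regime: both attackers poison the same corpus.
competing = top_k(query, clean + attacker_a + attacker_b, K)
comp_a = slot_share(competing, "A")
comp_b = slot_share(competing, "B")

print(f"alone:     A={alone_a:.2f}  B={alone_b:.2f}")
print(f"competing: A={comp_a:.2f}  B={comp_b:.2f}")
```

In this toy setup the two attackers look equally strong in isolation, each holding two of the three retrieved slots, yet under competition A keeps only one slot while B keeps two. A single-attacker evaluation would not distinguish them, which is the kind of gap the multi-adversary setting is meant to expose.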