Robust Multi-Agent LLMs under Byzantine Faults

Haejoon Lee, Vincent-Daniel Yun, Hyeonho Oh, Dimitra Panagou, Sai Praneeth Karimireddy

arXiv:2605.09076

AI Analysis

This work provides a fully decentralized defense against Byzantine attacks in LLM multi-agent systems, addressing a critical vulnerability in collaborative LLM networks.

The paper tackles the problem of Byzantine faults in decentralized LLM multi-agent systems, where malicious agents can degrade performance. The proposed Self-Anchored Consensus (SAC) protocol enables agents to iteratively filter unreliable messages and refine outputs, achieving robustness under adversarial conditions and improving performance on reasoning benchmarks.

Large language model (LLM) agents increasingly collaborate over peer-to-peer networks to improve their reliability. However, these same interactions can also become a source of vulnerability, as unreliable or Byzantine agents may sway neighboring agents toward incorrect conclusions and degrade overall system performance. Existing methods rely on leader-based coordination or self-reported confidence, both of which are susceptible to adversarial manipulation. We study decentralized LLM multi-agent systems (LLM-MAS) and propose Self-Anchored Consensus (SAC), a fully decentralized iterative filter-and-refine protocol in which agents iteratively exchange responses, locally evaluate and filter unreliable messages, and refine their own outputs. We present $(F{+}1)$-robustness conditions for the communication graph that ensure honest agents preserve and propagate reliable information despite Byzantine influence. Experiments on mathematical and commonsense reasoning benchmarks show that SAC effectively suppresses Byzantine influence and consistently improves performance across diverse communication topologies, whereas prior methods degrade under adversarial conditions.
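The abstract does not spell out its $(F{+}1)$-robustness conditions, so the following is only a sketch under an assumption: that they resemble the standard notion of $r$-robustness from the resilient-consensus literature, where a graph is $r$-robust if, for every pair of nonempty disjoint node subsets, at least one subset contains a node with at least $r$ in-neighbors outside that subset. The function names and the `adj_in` representation (a dict mapping each agent to the set of its in-neighbors) are hypothetical and not taken from the paper. The check is brute force and only practical for small graphs.

```python
from itertools import combinations


def is_r_reachable(adj_in, subset, r):
    """True if some node in `subset` has at least r in-neighbors outside `subset`."""
    return any(len(adj_in[v] - subset) >= r for v in subset)


def is_r_robust(adj_in, r):
    """Brute-force test of the standard r-robustness definition (assumed here).

    adj_in: dict mapping each node to the set of its in-neighbors (hypothetical format).
    Returns False as soon as a pair of nonempty disjoint subsets is found
    in which neither subset is r-reachable.
    """
    nodes = list(adj_in)
    n = len(nodes)
    for k1 in range(1, n):
        for s1 in combinations(nodes, k1):
            s1 = frozenset(s1)
            if is_r_reachable(adj_in, s1, r):
                continue  # every pair containing this s1 already satisfies the condition
            rest = [v for v in nodes if v not in s1]
            for k2 in range(1, len(rest) + 1):
                for s2 in combinations(rest, k2):
                    if not is_r_reachable(adj_in, frozenset(s2), r):
                        return False
    return True


# Two illustrative 5-agent topologies (not from the paper's experiments).
complete = {i: {j for j in range(5) if j != i} for i in range(5)}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

F = 1  # assumed upper bound on the number of Byzantine agents
print("complete graph is (F+1)-robust:", is_r_robust(complete, F + 1))
print("ring graph is (F+1)-robust:", is_r_robust(ring, F + 1))
```

Under this assumed definition, with $F = 1$ the complete graph on five agents passes the check while the five-agent ring does not, which illustrates why sparse topologies may need extra links before honest agents can outvote a single Byzantine neighbor. The paper's actual conditions may differ in detail.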
