Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing
This work addresses safety concerns in deploying LLM-based multi-agent systems in high-stakes real-world environments, though the contribution appears incremental: it adapts an existing robustness technique, randomized smoothing, to a new context.
The paper tackles the problem of adversarial attacks on large language model (LLM) multi-agent systems in safety-critical domains like aerospace by applying randomized smoothing to provide probabilistic guarantees on agent decisions. Simulation results show the method prevents adversarial behavior propagation and hallucinations while maintaining consensus performance.
This paper presents a defense framework for enhancing the safety of large language model (LLM)-empowered multi-agent systems (MAS) in safety-critical domains such as aerospace. We apply randomized smoothing, a statistical robustness certification technique, to the MAS consensus context, enabling probabilistic guarantees on agent decisions under adversarial influence. Unlike traditional verification methods, our approach operates in black-box settings and employs a two-stage adaptive sampling mechanism to balance robustness and computational efficiency. Simulation results demonstrate that our method effectively prevents the propagation of adversarial behaviors and hallucinations while maintaining consensus performance. This work provides a practical and scalable path toward safe deployment of LLM-based MAS in real-world, high-stakes environments.
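The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of how randomized smoothing with a two-stage sample budget might look for a single black-box agent decision. Everything in it is an assumption made for illustration rather than the paper's actual method: the names `smoothed_decision` and `hoeffding_lower_bound`, the scalar input `x`, the Gaussian noise scale `sigma`, the sample sizes `n_pilot` and `n_full`, and the use of a Hoeffding bound with the error budget `alpha` split evenly across the two stages. The idea shown is that a small pilot sample settles clear-cut cases cheaply, and the larger budget is spent only when the pilot vote is inconclusive.

```python
import math
import random
from collections import Counter
from typing import Callable, Hashable, Optional, Tuple


def hoeffding_lower_bound(successes: int, n: int, alpha: float) -> float:
    """One-sided (1 - alpha) Hoeffding lower bound on a Bernoulli mean."""
    return successes / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))


def smoothed_decision(
    decide: Callable[[float], Hashable],  # hypothetical black-box agent decision
    x: float,                             # hypothetical scalar input (e.g. a neighbor's value)
    sigma: float = 0.5,                   # assumed Gaussian noise scale
    n_pilot: int = 20,                    # assumed stage-1 budget
    n_full: int = 200,                    # assumed total budget after stage 2
    alpha: float = 0.05,                  # assumed overall error budget
    rng: Optional[random.Random] = None,
) -> Tuple[Optional[Hashable], float, int]:
    """Majority vote over noisy queries; returns (label or None, bound, samples used).

    The decision is accepted only if the lower confidence bound on the
    top label's vote share exceeds 1/2; otherwise the function abstains.
    Selecting and testing the label on the same samples is a
    simplification, so this is illustrative, not a rigorous certificate.
    """
    rng = rng or random.Random()
    votes: Counter = Counter()
    n = 0
    lower = 0.0
    for budget in (n_pilot, n_full):
        # Top up the sample to the current stage's budget, reusing earlier votes.
        while n < budget:
            votes[decide(x + rng.gauss(0.0, sigma))] += 1
            n += 1
        label, count = votes.most_common(1)[0]
        # alpha / 2 per stage: a union bound over the two looks at the data.
        lower = hoeffding_lower_bound(count, n, alpha / 2.0)
        if lower > 0.5:
            return label, lower, n
    return None, lower, n  # abstain: no label is reliably in the majority
```

In a consensus setting, an agent could call something like `smoothed_decision` on each incoming message and discard messages for which it abstains, which is one plausible way to keep a single manipulated input from propagating; whether the paper does this is not stated in the abstract.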