AI GT MA | Dec 6, 2022

What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Shaofeng Zou, Fei Miao

arXiv:2212.02705v5 | 20.835 citations | h-index: 30 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the robustness of multi-agent reinforcement learning (MARL) policies against adversarial state perturbation attacks, which is critical for real-world applications such as autonomous systems. The contribution is incremental because it builds on existing MARL frameworks.

The paper addresses the vulnerability of multi-agent reinforcement learning (MARL) policies to adversarial state perturbations. It proposes a State-Adversarial Markov Game (SAMG) and a new solution concept, the robust agent policy, which maximizes the worst-case expected state value. In the reported experiments, the proposed Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm outperforms existing methods under state perturbations and improves the robustness of the learned policies.

Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate different solution concepts of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, where agents aim to maximize the worst-case expected state value. We prove the existence of robust agent policy for finite state and finite action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is public on https://songyanghan.github.io/what_is_solution/.
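To make the idea of "maximizing the worst-case expected state value" concrete, the following is a minimal sketch on a toy problem. It is not the paper's SAMG formulation or the RMA3C algorithm. Everything in it is an assumption chosen for illustration: a single agent, a one-step decision, two states, two actions, deterministic policies only, an adversary that may show the agent any observation, and the hypothetical names `REWARD`, `INITIAL_DIST`, and `PERTURB_SET`.

```python
from itertools import product

# Hypothetical toy problem (not from the paper): 2 true states, 2 actions.
REWARD = [[1.0, 0.0],   # REWARD[s][a]: reward in true state s for action a
          [0.0, 1.0]]
INITIAL_DIST = [0.6, 0.4]        # assumed distribution over true states
PERTURB_SET = [[0, 1], [0, 1]]   # observations the adversary may show in state s


def worst_case_values(policy):
    """Per-state value when the adversary picks the most harmful observation."""
    return [
        min(REWARD[s][policy[obs]] for obs in PERTURB_SET[s])
        for s in range(len(REWARD))
    ]


def worst_case_expected_value(policy):
    """Expectation of the worst-case per-state values under INITIAL_DIST."""
    values = worst_case_values(policy)
    return sum(p * v for p, v in zip(INITIAL_DIST, values))


# Enumerate all deterministic policies: policy[obs] is the action taken
# when the agent observes `obs`.
policies = list(product(range(2), repeat=2))
robust_policy = max(policies, key=worst_case_expected_value)

for policy in policies:
    print(policy, worst_case_values(policy), worst_case_expected_value(policy))
print("robust policy:", robust_policy)
```

In this toy, the policy that always takes action 0 has worst-case values of 1 in state 0 and 0 in state 1, and the policy that always takes action 1 has the reverse, so no single policy is best in every state at once. Scoring each policy by its expected worst-case value under the initial distribution still yields a well-defined best choice, which is the flavor of the robust agent policy described in the abstract.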
