LG · Aug 19, 2023

DPMAC: Differentially Private Communication for Cooperative Multi-Agent Reinforcement Learning

Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li

arXiv:2308.09902v15.34 citations · h-index: 20 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns in MARL for applications like robotics or autonomous systems, though it is incremental as it builds on existing MARL and differential privacy techniques.

The paper tackles the problem of protecting sensitive information in cooperative multi-agent reinforcement learning (MARL) by proposing DPMAC, a differentially private communication algorithm that gives each agent's messages a rigorous privacy guarantee while preserving cooperation. In the authors' experiments, DPMAC shows a clear advantage over baseline methods in privacy-preserving scenarios.

Communication lays the foundation for cooperation in human society and in multi-agent reinforcement learning (MARL). Humans also desire to maintain their privacy when communicating with others, yet such privacy concerns have not been considered in existing work on MARL. To this end, we propose the differentially private multi-agent communication (DPMAC) algorithm, which protects the sensitive information of individual agents by equipping each agent with a local message sender that carries a rigorous $(ε, δ)$-differential privacy (DP) guarantee. In contrast to directly perturbing the messages with predefined DP noise, as is commonly done in privacy-preserving scenarios, we adopt a stochastic message sender for each agent and incorporate the DP requirement into the sender, which automatically adjusts the learned message distribution to alleviate the instability caused by DP noise. Further, we prove the existence of a Nash equilibrium in cooperative MARL with privacy-preserving communication, which suggests that this problem is game-theoretically learnable. Extensive experiments demonstrate a clear advantage of DPMAC over baseline methods in privacy-preserving scenarios.
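For orientation, the baseline that the abstract contrasts with (perturbing a message directly with predefined DP noise) can be sketched with the classical Gaussian mechanism. This is not DPMAC's learned stochastic sender, and it is not taken from the paper's code. The function names, the L2 clipping step, and the sensitivity bound of `2 * clip_norm` are illustrative assumptions. The noise scale uses the standard formula σ = sqrt(2 ln(1.25/δ)) · Δ / ε, which holds for 0 < ε < 1.

```python
import math
import random


def gaussian_sigma(epsilon, delta, l2_sensitivity):
    """Noise scale of the classical Gaussian mechanism (requires 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classical Gaussian mechanism needs 0 < epsilon, delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def clip_l2(message, clip_norm):
    """Scale the message down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in message))
    if norm <= clip_norm:
        return list(message)
    scale = clip_norm / norm
    return [x * scale for x in message]


def privatize_message(message, epsilon, delta, clip_norm, rng=None):
    """Hypothetical baseline sender: clip, then add fixed Gaussian noise.

    Assumption: any two clipped messages differ by at most 2 * clip_norm
    in L2 norm, which is used as the sensitivity bound.
    """
    rng = rng or random.Random()
    clipped = clip_l2(message, clip_norm)
    sigma = gaussian_sigma(epsilon, delta, 2.0 * clip_norm)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


if __name__ == "__main__":
    noisy = privatize_message([3.0, 4.0], epsilon=0.5, delta=1e-5, clip_norm=1.0)
    print(noisy)
```

With a small ε the noise scale is large relative to the clipped message, which illustrates the instability the abstract attributes to predefined DP noise and that DPMAC aims to alleviate by folding the DP requirement into a learned sender.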
