MA AI GT LG Feb 9, 2021

Multi-Agent Coordination in Adversarial Environments through Signal Mediated Strategies

Federico Cacciamani, Andrea Celli, Marco Ciccone, Nicola Gatti

arXiv:2102.05026v14.39 citations h-index: 25

Originality: Incremental advance

AI Analysis

This work provides a method for multi-agent teams to coordinate in adversarial, imperfect-information settings, which is relevant for scenarios like card games or bidding, offering an incremental improvement over existing multi-agent RL algorithms.

This paper addresses multi-agent coordination in zero-sum, imperfect-information games where agents cannot communicate during play. The authors propose a game-theoretic centralized training regimen for trajectory sampling and a signaling-based framework, which empirically shows convergence to coordinated equilibria where previous state-of-the-art multi-agent RL algorithms failed.

Many real-world scenarios involve teams of agents that have to coordinate their actions to reach a shared goal. We focus on the setting in which a team of agents faces an opponent in a zero-sum, imperfect-information game. Team members can coordinate their strategies before the beginning of the game, but are unable to communicate during the playing phase of the game. This is the case, for example, in Bridge, collusion in poker, and collusion in bidding. In this setting, model-free RL methods are oftentimes unable to capture coordination because agents' policies are executed in a decentralized fashion. Our first contribution is a game-theoretic centralized training regimen to effectively perform trajectory sampling so as to foster team coordination. When team members can observe each other's actions, we show that this approach provably yields equilibrium strategies. Then, we introduce a signaling-based framework to represent team coordinated strategies given a buffer of past experiences. Each team member's policy is parametrized as a neural network whose output is conditioned on a suitable exogenous signal, drawn from a learned probability distribution. By combining these two elements, we empirically show convergence to coordinated equilibria in cases where previous state-of-the-art multi-agent RL algorithms did not.
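The signaling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper describes neural-network policies and a learned signal distribution, whereas the toy below assumes hand-set lookup tables of logits and a fixed signal distribution. The names `SignalConditionedPolicy` and `play_episode` are hypothetical. It only shows how a signal drawn once before play lets teammates act in a correlated way without communicating during the game.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    # Draw one index from a categorical distribution.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


class SignalConditionedPolicy:
    """Toy tabular stand-in (assumption) for a network pi(a | observation, signal)."""

    def __init__(self, logits):
        # logits[signal][observation] is a list of action logits.
        self.logits = logits

    def act(self, observation, signal, rng):
        return sample(softmax(self.logits[signal][observation]), rng)


def play_episode(signal_logits, policies, observations, rng):
    # The signal is drawn once, before play, and shared by all team members.
    signal = sample(softmax(signal_logits), rng)
    # During play each member sees only its own observation and the signal;
    # no messages are exchanged between teammates.
    actions = [p.act(o, signal, rng) for p, o in zip(policies, observations)]
    return signal, actions


if __name__ == "__main__":
    rng = random.Random(0)
    # Two signals, one observation, two actions: each signal strongly favors
    # the action with the same index, so teammates end up matching.
    table = [[[20.0, -20.0]], [[-20.0, 20.0]]]
    team = [SignalConditionedPolicy(table), SignalConditionedPolicy(table)]
    for _ in range(5):
        print(play_episode([0.0, 0.0], team, [0, 0], rng))
```

In this toy, the team mixes between both joint actions across episodes while still matching within each episode, which independent randomization by each member could not achieve.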

