AI | GT | Feb 5, 2024

Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games

arXiv:2402.03136v2 | 1 citation | h-index: 8 | ICML
AI Analysis

This work addresses zero-shot interaction between AI agents in simultaneous games. The method is an incremental extension of existing techniques, but it delivers strong gains in specific cooperative and competitive settings.

The paper tackles the challenge of adapting self-play and planning algorithms such as AlphaZero to simultaneous games, where missing information about other agents' concurrent actions limits performance. It proposes Albatross, which learns a Smooth Best Response Logit Equilibrium (SBRLE) to enable interaction with agents of any playing strength. Albatross improves on the previous state of the art by 37.6% in the cooperative Overcooked benchmark and effectively exploits weak agents in Battlesnake.

The combination of self-play and planning has achieved great successes in sequential games, for instance in Chess and Go. However, adapting algorithms such as AlphaZero to simultaneous games poses a new challenge. In these games, missing information about concurrent actions of other agents is a limiting factor, as they may select different Nash equilibria or may not play optimally at all. Thus, it is vital to model the behavior of the other agents when interacting with them in simultaneous games. To this end, we propose Albatross: AlphaZero for Learning Bounded-rational Agents and Temperature-based Response Optimization using Simulated Self-play. Albatross learns to play the novel equilibrium concept of a Smooth Best Response Logit Equilibrium (SBRLE), which enables cooperation and competition with agents of any playing strength. We perform an extensive evaluation of Albatross on a set of cooperative and competitive simultaneous perfect-information games. In contrast to AlphaZero, Albatross is able to exploit weak agents in the competitive game of Battlesnake. Additionally, it yields an improvement of 37.6% compared to previous state of the art in the cooperative Overcooked benchmark.
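
The abstract does not define SBRLE, so the following is only a rough sketch of the ingredients its name suggests, on a tiny two-player matrix game. It makes several assumptions. A "smooth best response" is taken to be a softmax over expected payoffs, scaled by a temperature (0 gives uniform play, large values approach a strict best response). A "logit equilibrium" is taken to be a point where both players smoothly best-respond to each other. It is found here by a simple damped fixed-point iteration, which is not guaranteed to converge in general. The function names and the example payoffs are hypothetical, and none of this is the Albatross training procedure.

```python
import math

def softmax(values, temperature):
    """Softmax over temperature-scaled values (temperature 0 -> uniform)."""
    scaled = [temperature * v for v in values]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_best_response(payoff, other_policy, temperature):
    """Softmax response to a fixed mixed policy of the other player.

    payoff[i][j] is this player's payoff for own action i against
    the other player's action j.
    """
    expected = [sum(p * q for p, q in zip(row, other_policy)) for row in payoff]
    return softmax(expected, temperature)

def logit_equilibrium(payoff_a, payoff_b, temperature, iters=500, damping=0.5):
    """Damped fixed-point iteration toward mutual smooth best responses.

    Both payoff matrices are indexed by the player's own action first.
    Assumption: the iteration converges for the game at hand.
    """
    x = [1.0 / len(payoff_a)] * len(payoff_a)
    y = [1.0 / len(payoff_b)] * len(payoff_b)
    for _ in range(iters):
        new_x = smooth_best_response(payoff_a, y, temperature)
        new_y = smooth_best_response(payoff_b, x, temperature)
        x = [(1 - damping) * a + damping * b for a, b in zip(x, new_x)]
        y = [(1 - damping) * a + damping * b for a, b in zip(y, new_y)]
    return x, y

# Hypothetical cooperative coordination game: matching on action 0 pays 2,
# matching on action 1 pays 1, mismatching pays 0 (same payoffs for both).
coop = [[2.0, 0.0], [0.0, 1.0]]

# Model a bounded-rational partner via a low-temperature logit equilibrium,
# then respond to that modeled partner with a much sharper temperature.
_, partner_policy = logit_equilibrium(coop, coop, temperature=1.0)
response = smooth_best_response(coop, partner_policy, temperature=20.0)
```

In this sketch, a lower equilibrium temperature models a weaker partner whose play is closer to uniform, and the separate response temperature controls how strongly the responding agent commits to the action that scores best against that modeled partner.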
