AI | GT | LG | MA | Nov 2, 2017

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

arXiv:1711.00832v2 | 735 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of achieving robust, generalizable interactions in multiagent reinforcement learning (MARL), which is crucial for developing intelligent agents in shared environments. It appears incremental, however, because it builds on existing game-theoretic concepts.

The paper tackles the problem of overfitting in multiagent reinforcement learning (MARL) by introducing a unified game-theoretic approach that generalizes previous methods, demonstrating improved policy generality in partially observable settings like gridworld coordination games and poker.

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
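The population-based loop described in the abstract can be illustrated on a toy matrix game. The following is a minimal sketch under strong assumptions, not the paper's implementation: the game is rock-paper-scissors, the deep reinforcement learning oracle is replaced by an exact best response, and only two simple meta-solvers are shown. The names `meta_strategy`, `run_population_loop`, and `exploitability` are hypothetical. Putting all weight on the newest policy behaves like iterated best response, while weighting every policy in the population equally behaves like fictitious play; a Nash meta-solver (the double oracle case) is omitted for brevity.

```python
import numpy as np

# Row player's payoffs in rock-paper-scissors (zero-sum: the column player gets -A).
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])


def meta_strategy(population, num_actions, solver):
    """Turn a population of pure strategies into a mixed strategy over actions."""
    if solver == "last":
        # All weight on the newest policy: iterated best response.
        mix = np.zeros(num_actions)
        mix[population[-1]] = 1.0
        return mix
    if solver == "uniform":
        # Equal weight on every policy found so far: fictitious play.
        counts = np.bincount(population, minlength=num_actions)
        return counts / len(population)
    raise ValueError(f"unknown meta-solver: {solver}")


def run_population_loop(A, solver, iterations):
    """Grow each player's population with best responses to the opponent's mixture."""
    row_pop, col_pop = [0], [0]  # both players start with action 0 (rock)
    for _ in range(iterations):
        row_mix = meta_strategy(row_pop, A.shape[0], solver)
        col_mix = meta_strategy(col_pop, A.shape[1], solver)
        # Exact best responses stand in for the approximate (deep RL) oracles.
        row_br = int(np.argmax(A @ col_mix))
        col_br = int(np.argmax(-(row_mix @ A)))
        row_pop.append(row_br)
        col_pop.append(col_br)
    return row_pop, col_pop


def exploitability(A, row_mix, col_mix):
    """Sum of both players' best-response gains; zero at a Nash equilibrium here."""
    return float(np.max(A @ col_mix) + np.max(-(row_mix @ A)))


if __name__ == "__main__":
    row_pop, _ = run_population_loop(RPS, "last", 3)
    print("iterated best response cycles:", row_pop)

    row_pop, col_pop = run_population_loop(RPS, "uniform", 2000)
    row_mix = meta_strategy(row_pop, 3, "uniform")
    col_mix = meta_strategy(col_pop, 3, "uniform")
    print("fictitious-play mixture:", np.round(row_mix, 3))
    print("exploitability:", exploitability(RPS, row_mix, col_mix))
```

In this toy setting the "last" meta-solver cycles through rock, paper, and scissors indefinitely, whereas the "uniform" meta-solver produces a mixture that drifts toward the equilibrium. The contrast is meant only to show why the choice of meta-solver matters in such a loop.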

Code Implementations: 1 repo
