LG, GT · Oct 10, 2023

Sample-Efficient Multi-Agent RL: An Optimization Perspective

arXiv:2310.06243v1 · 2 citations · h-index: 48
Originality: Highly original
AI Analysis

This work addresses sample efficiency in multi-agent reinforcement learning and offers an approach that is easier to implement in practice than prior methods.

The paper studies sample-efficient multi-agent reinforcement learning in general-sum Markov Games. It introduces the Multi-Agent Decoupling Coefficient (MADC) as a complexity measure and builds a unified algorithmic framework that learns Nash, Coarse Correlated, and Correlated Equilibria sample-efficiently when the MADC is low, with sublinear regret comparable to existing work.

We study multi-agent reinforcement learning (MARL) for general-sum Markov Games (MGs) under general function approximation. To find the minimal assumption for sample-efficient learning, we introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs. Using this measure, we propose the first unified algorithmic framework that ensures sample efficiency in learning Nash Equilibrium, Coarse Correlated Equilibrium, and Correlated Equilibrium for both model-based and model-free MARL problems with low MADC. We also show that our algorithm achieves sublinear regret comparable to existing works. Moreover, our algorithm combines an equilibrium-solving oracle with a single-objective optimization subprocedure that solves for the regularized payoff of each deterministic joint policy. This avoids solving constrained optimization problems within data-dependent constraints (Jin et al. 2020; Wang et al. 2023) or executing sampling procedures with complex multi-objective optimization problems (Foster et al. 2023), and makes the algorithm more amenable to empirical implementation.
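The two-part structure described in the abstract (a single-objective optimization per deterministic joint policy, followed by an equilibrium-solving oracle) can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a one-step, two-player matrix game, a finite hypothesis class of candidate payoff tensors, a given loss per hypothesis, and a pure-strategy Nash enumeration standing in for the oracle. The names `regularized_payoffs`, `pure_nash_oracle`, `hyp_a`, `hyp_b`, and `eta` are all hypothetical.

```python
import itertools
import numpy as np


def regularized_payoffs(hypotheses, losses, eta):
    """For each player i and each pure joint action a, solve the
    single-objective problem  max_f [ f_i(a) - eta * loss(f) ]
    over a finite hypothesis class (an assumption of this sketch)."""
    stacked = np.stack(hypotheses)        # shape (F, n_players, A1, A2)
    penalty = eta * np.asarray(losses)    # shape (F,)
    return (stacked - penalty[:, None, None, None]).max(axis=0)


def pure_nash_oracle(payoffs):
    """Stand-in equilibrium oracle: enumerate pure joint actions and keep
    those where no player gains by deviating alone. A pure equilibrium
    need not exist in general, so the result may be empty."""
    n_players = payoffs.shape[0]
    action_shape = payoffs.shape[1:]
    equilibria = []
    for joint in itertools.product(*(range(k) for k in action_shape)):
        stable = True
        for i in range(n_players):
            for deviation in range(action_shape[i]):
                alt = list(joint)
                alt[i] = deviation
                if payoffs[(i, *alt)] > payoffs[(i, *joint)] + 1e-12:
                    stable = False
        if stable:
            equilibria.append(joint)
    return equilibria


# Hypothesis A: prisoner's dilemma payoffs (0 = cooperate, 1 = defect),
# indexed as [player, action of player 0, action of player 1].
hyp_a = np.array([[[3.0, 0.0], [5.0, 1.0]],
                  [[3.0, 5.0], [0.0, 1.0]]])
# Hypothesis B: an over-optimistic model that fits the data poorly.
hyp_b = np.full((2, 2, 2), 10.0)

# With eta = 5 the poorly fitting hypothesis is penalised by 20 and drops out.
optimistic = regularized_payoffs([hyp_a, hyp_b], losses=[0.0, 4.0], eta=5.0)
print(pure_nash_oracle(optimistic))  # [(1, 1)]
```

In this toy setup the regularization weight `eta` trades off optimism against data fit: with `eta = 0` the over-optimistic hypothesis dominates every entry, while a larger `eta` leaves only the well-fitting payoffs for the oracle to solve.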

