MA AI LG | Feb 20, 2023

Differentiable Arbitrating in Zero-sum Markov Games

Jing Wang, Meichen Song, Feng Gao, Boyi Liu, Zhaoran Wang, Yi Wu

arXiv:2302.10058v13.32 citations | h-index: 40

Originality: Incremental advance

AI Analysis

This work enables end-to-end, gradient-based optimization of reward perturbations in two-player zero-sum Markov games, a problem relevant to multi-agent systems researchers and practitioners. The advance appears incremental because the method builds on existing Nash equilibrium solvers.

The paper tackles the problem of perturbing rewards in zero-sum Markov games to induce a desirable Nash equilibrium. It proposes a backpropagation scheme that differentiates through the equilibrium while treating the Nash equilibrium solver as a black box, and it reports empirical success in two multi-agent reinforcement learning environments.

We initiate the study of how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating. Such a problem admits a bi-level optimization formulation. The lower level requires solving the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize in an end-to-end way. We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level. In particular, our method only requires a black-box solver for the (regularized) Nash equilibrium (NE). We develop the convergence analysis for the proposed framework with proper black-box NE solvers and demonstrate the empirical successes in two multi-agent reinforcement learning (MARL) environments.
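To make the bi-level idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes a single-state zero-sum game (a payoff matrix R), entropy regularization with temperature tau, a simple fixed-point iteration standing in for the black-box solver, and a squared-distance upper-level loss toward a target strategy pair. All function names, the example matrix, and the target strategies are hypothetical. The gradient with respect to R is obtained by implicit differentiation of the fixed-point condition x = softmax(R y / tau), y = softmax(-R^T x / tau), so the solver's internals are never differentiated.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # shift for numerical stability
    return e / e.sum()

def solve_regularized_ne(R, tau, iters=500):
    """Stand-in black-box solver: fixed-point iteration for the
    entropy-regularized NE. Assumes ||R||_2 / (2 * tau) < 1, which makes
    the update a contraction (an assumption of this sketch)."""
    m, n = R.shape
    x, y = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    for _ in range(iters):
        # Simultaneous update: both right-hand sides use the old x and y.
        x, y = softmax(R @ y / tau), softmax(-R.T @ x / tau)
    return x, y

def reward_gradient(R, x, y, grad_x, grad_y, tau):
    """Gradient of the upper-level loss w.r.t. R, given the loss gradients
    (grad_x, grad_y) at the equilibrium (x, y), via the implicit function
    theorem applied to z = F(z, R)."""
    m, n = R.shape
    Sx = np.diag(x) - np.outer(x, x)  # softmax Jacobian at x
    Sy = np.diag(y) - np.outer(y, y)  # softmax Jacobian at y
    # Jacobian of the fixed-point map F with respect to z = (x, y).
    J = np.block([[np.zeros((m, m)), Sx @ R / tau],
                  [-Sy @ R.T / tau, np.zeros((n, n))]])
    # Adjoint solve: (I - J)^T u = grad_z loss.
    u = np.linalg.solve((np.eye(m + n) - J).T,
                        np.concatenate([grad_x, grad_y]))
    ux, uy = u[:m], u[m:]
    # Contract u with the partial derivative of F with respect to R.
    return (np.outer(Sx @ ux, y) - np.outer(x, Sy @ uy)) / tau

def upper_loss(R, tau, target_x, target_y):
    """Upper level: squared distance of the induced NE from a target."""
    x, y = solve_regularized_ne(R, tau)
    return 0.5 * np.sum((x - target_x) ** 2) + 0.5 * np.sum((y - target_y) ** 2)

# Hypothetical example: one arbitration step on a 2x2 game.
R = np.array([[2.0, -1.0], [-1.0, 1.0]])
tau = 2.0
target_x, target_y = np.array([0.7, 0.3]), np.array([0.4, 0.6])

x, y = solve_regularized_ne(R, tau)
grad_R = reward_gradient(R, x, y, x - target_x, y - target_y, tau)
R_next = R - 0.1 * grad_R  # one gradient step on the reward
```

The adjoint solve costs one linear system per upper-level step, independent of how many iterations the lower-level solver ran, which is the practical appeal of differentiating through the equilibrium condition instead of unrolling the solver. Extending this to multi-state Markov games requires handling the value recursion across states, which this sketch omits.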
