GTLGJul 13, 2023

Multi-Player Zero-Sum Markov Games with Networked Separable Interactions

arXiv:2307.09470v315 citationsh-index: 68
Originality Incremental advance
AI Analysis

This work addresses non-cooperative multi-agent systems with local interactions, providing theoretical insights and algorithms, though it is incremental in extending existing game theory concepts to networked structures.

The paper tackles the problem of modeling local interactions in multi-agent sequential decision-making by introducing a new class of zero-sum Markov games with networked separable interactions, showing that Markov coarse correlated equilibria collapse to Nash equilibria in these games and proving computational hardness for finding approximate stationary equilibria, with convergence guarantees for specific network structures and algorithms achieving finite-iteration results.

We study a new class of Markov games, \emph(multi-player) zero-sum Markov Games} with \emph{Networked separable interactions} (zero-sum NMGs), to model the local interaction structure in non-cooperative multi-agent sequential decision-making. We define a zero-sum NMG as a model where {the payoffs of the auxiliary games associated with each state are zero-sum and} have some separable (i.e., polymatrix) structure across the neighbors over some interaction network. We first identify the necessary and sufficient conditions under which an MG can be presented as a zero-sum NMG, and show that the set of Markov coarse correlated equilibrium (CCE) collapses to the set of Markov Nash equilibrium (NE) in these games, in that the product of per-state marginalization of the former for all players yields the latter. Furthermore, we show that finding approximate Markov \emph{stationary} CCE in infinite-horizon discounted zero-sum NMGs is \texttt{PPAD}-hard, unless the underlying network has a ``star topology''. Then, we propose fictitious-play-type dynamics, the classical learning dynamics in normal-form games, for zero-sum NMGs, and establish convergence guarantees to Markov stationary NE under a star-shaped network structure. Finally, in light of the hardness result, we focus on computing a Markov \emph{non-stationary} NE and provide finite-iteration guarantees for a series of value-iteration-based algorithms. We also provide numerical experiments to corroborate our theoretical results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes