MA GT LG
Jun 12, 2025

Shapley Machine: A Game-Theoretic Framework for N-Agent Ad Hoc Teamwork

Jianhong Wang, Yang Li, Samuel Kaski, Jonathan Lawry

arXiv:2506.11285v1, 2.31 citations, h-index: 3, Has Code

Originality: Highly original

AI Analysis

The paper addresses credit assignment in open multi-agent systems, with applications such as smart grids and swarm robotics, and offers a novel theoretical framework for it.

The paper tackles the n-agent ad hoc teamwork (NAHT) problem in open multi-agent systems. It models the problem with cooperative game theory, derives Shapley values for credit assignment among the controlled agents, and proposes a TD(λ)-like RL algorithm called Shapley Machine, which the authors report to be effective in experiments.
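For readers unfamiliar with Shapley values, the following is a minimal sketch of the classical explicit form: each player's average marginal contribution over all join orders. It is not the paper's method, which shapes Shapley values through their axioms rather than this formula. The three-agent game `toy_value`, the agent names, and the synergy bonus are all hypothetical, chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to the coalition's worth.
    Cost grows factorially, so this is only suitable for tiny games.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: additive individual worth plus a bonus when a and b cooperate."""
    base = {"a": 1.0, "b": 2.0, "c": 3.0}
    worth = sum(base[p] for p in coalition)
    if {"a", "b"} <= coalition:
        worth += 2.0  # assumed synergy between agents a and b
    return worth


players = ["a", "b", "c"]
phi = shapley_values(players, toy_value)
grand_coalition_value = toy_value(frozenset(players))
```

In this toy game the bonus is split evenly between `a` and `b`, and the credits sum to the value of the full team, which is the efficiency axiom.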

Open multi-agent systems are increasingly important in modeling real-world applications such as smart grids and swarm robotics. In this paper, we investigate a recently proposed problem for open multi-agent systems, referred to as n-agent ad hoc teamwork (NAHT), in which only a subset of the agents is controlled. Existing methods tend to be based on heuristic design; consequently, they lack theoretical rigor and leave credit assignment among agents ambiguous. To address these limitations, we model and solve NAHT through the lens of cooperative game theory. More specifically, we first model an open multi-agent system, characterized by its value, as an instance situated in a space of cooperative games generated by a set of basis games. We then extend this space, along with the state space, to accommodate dynamic scenarios, thereby characterizing NAHT. Exploiting the justifiable assumption that basis game values correspond to a sequence of n-step returns with different horizons, we represent the state values for NAHT in a form similar to $λ$-returns. Furthermore, we derive Shapley values to allocate state values to the controlled agents as credits for their contributions to the ad hoc team. Unlike the conventional approach of shaping Shapley values in an explicit form, we shape them by fulfilling the three axioms that uniquely describe them, which are well defined on the extended game space describing NAHT. To estimate Shapley values in dynamic scenarios, we propose a TD($λ$)-like algorithm. The resulting reinforcement learning (RL) algorithm is referred to as Shapley Machine. To the best of our knowledge, this is the first time that concepts from cooperative game theory have been directly related to RL concepts. In experiments, we demonstrate the effectiveness of Shapley Machine and verify the reasonableness of our theory.
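The abstract relates state values to a weighted sequence of n-step returns with different horizons. As background, the sketch below computes the standard episodic λ-return from textbook RL, which mixes n-step returns with geometrically decaying weights. It illustrates only that standard quantity, not Shapley Machine or the paper's game-theoretic construction. The function names, the sample rewards, and the value estimates are assumptions made for illustration.

```python
def n_step_return(rewards, values, t, n, gamma):
    """n-step return from time t: up to n discounted rewards plus a bootstrapped value.

    rewards[k] is the reward received after step k; values[k] estimates the
    value of the state at step k. No bootstrap is added past the episode end.
    """
    T = len(rewards)
    end = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, end))
    if end < T:
        g += gamma ** (end - t) * values[end]
    return g


def lambda_return(rewards, values, t, gamma, lam):
    """Episodic lambda-return: weights (1 - lam) * lam**(n - 1) on the n-step
    returns, with the remaining weight placed on the full Monte Carlo return."""
    horizon = len(rewards) - t  # steps remaining in the episode
    g = 0.0
    for n in range(1, horizon):
        g += (1 - lam) * lam ** (n - 1) * n_step_return(rewards, values, t, n, gamma)
    g += lam ** (horizon - 1) * n_step_return(rewards, values, t, horizon, gamma)
    return g


# Hypothetical three-step episode with made-up value estimates.
rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.4, 0.3]
td0_target = lambda_return(rewards, values, t=0, gamma=0.9, lam=0.0)
mc_return = lambda_return(rewards, values, t=0, gamma=0.9, lam=1.0)
mixed = lambda_return(rewards, values, t=0, gamma=0.9, lam=0.5)
```

Setting `lam=0.0` recovers the one-step TD target and `lam=1.0` recovers the Monte Carlo return; intermediate values blend the n-step returns in between.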
