Learning Strategic Value and Cooperation in Multi-Player Stochastic Games through Side Payments
This paper provides a method for multi-agent systems to assess each player's contribution and to incentivize cooperation through side payments, though the contribution appears incremental because it builds on existing game-theoretic concepts.
The paper tackles the problem of quantifying strategic value and enabling cooperation in multi-player stochastic games. It extends the Harsanyi-Shapley value to this setting in two ways, shows that one of these generalizations can be computed using generalized Q-learning algorithms, and validates the approach empirically on grid-games with three or more players.
For general-sum, n-player, strategic games with transferable utility, the Harsanyi-Shapley (HS) value provides a computable method to both 1) quantify the strategic value of a player and 2) make cooperation rational through side payments. We give a simple formula to compute the HS value in normal-form games. Next, we provide two methods to generalize the HS value to stochastic (or Markov) games, and show that one of them may be computed using generalized Q-learning algorithms. Finally, an empirical validation is performed on stochastic grid-games with three or more players. Source code is provided to compute HS values for both the normal-form and stochastic game settings.
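To make the idea of valuing players and paying side payments concrete, here is a minimal Python sketch of the classical Shapley value of a coalitional game, followed by the transfers that move each player from a raw payoff to that allocation. It is an illustration under stated assumptions, not the paper's formula: the abstract does not spell out how coalition values are derived from the strategic game, so the table `v`, the payoff vector `raw_payoffs`, and the helper names `shapley_value` and `side_payments` are all hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def shapley_value(players, v):
    """Classical Shapley value: average each player's marginal
    contribution over every order in which the players could join."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += Fraction(v[grown]) - Fraction(v[coalition])
            coalition = grown
    return {p: totals[p] / len(orders) for p in players}


def side_payments(values, raw_payoffs):
    """Transfer each player must receive (positive) or pay (negative)
    so that the final payoff equals the assigned value."""
    return {p: values[p] - raw_payoffs[p] for p in values}


# Hypothetical 3-player coalition values (assumption, not from the paper).
players = ("A", "B", "C")
v = {
    frozenset(): 0,
    frozenset("A"): 0, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 6, frozenset("AC"): 6, frozenset("BC"): 0,
    frozenset("ABC"): 12,
}

# Hypothetical payoffs when all three play the jointly optimal outcome;
# they sum to v[frozenset("ABC")].
raw_payoffs = {"A": 2, "B": 5, "C": 5}

values = shapley_value(players, v)
transfers = side_payments(values, raw_payoffs)

for p in players:
    print(p, "value:", values[p], "transfer:", transfers[p])
```

Because the Shapley value distributes exactly the grand-coalition total, the transfers sum to zero whenever the raw payoffs also sum to that total, so the side payments are self-financing. Exact rational arithmetic via `Fraction` is a design choice to avoid floating-point noise in the averages; enumerating all join orders is only practical for a small number of players.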