GT LG Jan 16, 2014

Policy Invariance under Reward Transformations for General-Sum Stochastic Games

arXiv:1401.3907v1
12 citations
Originality Synthesis-oriented
AI Analysis

This work addresses the problem of accelerating convergence in multi-agent learning for general-sum stochastic games. The contribution is incremental: it extends an existing method to a new setting.

The authors extend potential-based shaping from Markov decision processes to multi-player general-sum stochastic games and prove that the Nash equilibria are unchanged by this transformation. This invariance offers a possible way to speed up convergence in learning.

We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remain unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding convergence when learning to play a stochastic game.
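
The intuition behind policy invariance can be illustrated with a small numerical sketch. It assumes the standard potential-based form of the shaping reward, F(s, s') = gamma * Phi(s') - Phi(s), applied to a single agent's reward stream. The potential table `phi`, the trajectory, and the helper names are hypothetical and chosen only for illustration; they are not taken from the paper. Along any trajectory the shaping terms telescope, so the shaped return differs from the original return only by a quantity that depends on the start and end states, not on the actions taken in between.

```python
import math

def shaping_reward(phi, s, s_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * phi[s'] - phi[s]."""
    return gamma * phi[s_next] - phi[s]

def discounted_return(rewards, gamma):
    """Discounted sum of a finite reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def shaped_rewards(rewards, states, phi, gamma):
    """Add the shaping term to each reward; states has len(rewards) + 1 entries."""
    return [
        r + shaping_reward(phi, states[t], states[t + 1], gamma)
        for t, r in enumerate(rewards)
    ]

# Hypothetical data: one agent's trajectory through three states.
gamma = 0.9
phi = {0: 0.3, 1: 1.2, 2: -0.7}      # arbitrary potential function (assumption)
states = [0, 1, 2, 0]                # s_0, s_1, s_2, s_3
rewards = [1.0, -0.5, 2.0]           # r_0, r_1, r_2

original = discounted_return(rewards, gamma)
shaped = discounted_return(shaped_rewards(rewards, states, phi, gamma), gamma)

# Telescoping: the shaped return equals the original return plus
# gamma^T * phi[s_T] - phi[s_0], independent of the intermediate states.
horizon = len(rewards)
offset = gamma ** horizon * phi[states[-1]] - phi[states[0]]
telescopes = math.isclose(shaped, original + offset, abs_tol=1e-12)
```

In the infinite-horizon discounted setting with a bounded potential, the gamma^T term vanishes, leaving an offset of -Phi(s_0) that is the same for every strategy. This is the single-trajectory intuition only; the equilibrium result for stochastic games is the paper's own proof and is not reproduced here.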

