LGGTMLNov 29, 2020

Minimax Sample Complexity for Turn-based Stochastic Game

arXiv:2011.14267v124 citations
AI Analysis

This work provides theoretical guarantees for multi-agent reinforcement learning algorithms, which is a significant problem for researchers and practitioners in the field.

This paper establishes that a plug-in solver approach achieves minimax sample complexity for turn-based stochastic games (TBSG). The authors demonstrate that an empirical Nash equilibrium strategy approximates the true Nash equilibrium, providing both problem-dependent and problem-independent bounds.

The empirical success of Multi-agent reinforcement learning is encouraging, while few theoretical guarantees have been revealed. In this work, we prove that the plug-in solver approach, probably the most natural reinforcement learning algorithm, achieves minimax sample complexity for turn-based stochastic game (TBSG). Specifically, we plan in an empirical TBSG by utilizing a `simulator' that allows sampling from arbitrary state-action pair. We show that the empirical Nash equilibrium strategy is an approximate Nash equilibrium strategy in the true TBSG and give both problem-dependent and problem-independent bound. We develop absorbing TBSG and reward perturbation techniques to tackle the complex statistical dependence. The key idea is artificially introducing a suboptimality gap in TBSG and then the Nash equilibrium strategy lies in a finite set.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes