LG · AI · GT · Mar 3, 2016

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

arXiv:1603.01121v2 · 455 citations
AI Analysis

This paper addresses the challenge of learning strategies at scale in complex imperfect-information games such as poker, offering a domain-agnostic alternative to methods that rely on expert-designed abstractions.

The paper tackles the problem of learning approximate Nash equilibria in large-scale imperfect-information games without prior domain knowledge. It introduces Neural Fictitious Self-Play (NFSP), which approached a Nash equilibrium in Leduc poker and came close to the performance of state-of-the-art algorithms in Limit Texas Holdem.

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
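
To make the "fictitious self-play" ingredient of the abstract more concrete, here is a minimal sketch of classical, tabular fictitious play on rock-paper-scissors. It is not NFSP and is not taken from the paper: the game, the function names (`fictitious_play`, `exploitability`), and the iteration count are all illustrative assumptions. Each player repeatedly best-responds to the opponent's empirical average strategy, and in a two-player zero-sum game those averages are known to converge to a Nash equilibrium. NFSP, as the abstract describes it, pairs this idea with deep reinforcement learning; the sketch uses exact best responses and exact averages instead of learned approximations.

```python
import numpy as np


def fictitious_play(payoff, iterations=20000):
    """Classical fictitious play on a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff when row plays i and column
    plays j; the column player receives -payoff[i, j].
    Returns the empirical average strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary initial actions so the averages are well defined.
    row_counts[0] += 1
    col_counts[0] += 1

    for _ in range(iterations):
        row_avg = row_counts / row_counts.sum()
        col_avg = col_counts / col_counts.sum()
        # Each player best-responds to the opponent's average strategy.
        row_best = np.argmax(payoff @ col_avg)   # row player maximizes
        col_best = np.argmin(row_avg @ payoff)   # column player minimizes
        row_counts[row_best] += 1
        col_counts[col_best] += 1

    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


def exploitability(payoff, row_strategy, col_strategy):
    """Total gain available to best responders; zero at a Nash equilibrium."""
    best_row_value = np.max(payoff @ col_strategy)
    best_col_value = np.min(row_strategy @ payoff)
    return float(best_row_value - best_col_value)


# Rock-paper-scissors payoffs for the row player (hypothetical test game).
RPS = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]], dtype=float)

row_avg, col_avg = fictitious_play(RPS)
print("row average strategy:", row_avg)
print("exploitability:", exploitability(RPS, row_avg, col_avg))
```

The average strategies should drift toward the uniform mix over rock, paper, and scissors, and the exploitability should shrink as the iteration count grows. The individual best responses keep cycling; it is the averages that settle, which is why fictitious play tracks an average strategy alongside the best response.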

Code Implementations · 7 repos

Data from Papers with Code (CC-BY-SA-4.0)

