LG, GT, ML | Jun 18, 2020

DREAM: Deep Regret minimization with Advantage baselines and Model-free learning

arXiv:2006.10410v2 | 62 citations
AI Analysis

This work addresses efficient strategy optimization in complex games without requiring a perfect simulator. The contribution is incremental in that it builds on existing regret-based methods, but it makes them more practical.

The paper tackles the problem of finding optimal strategies in imperfect-information multi-agent games. It introduces DREAM, a deep reinforcement learning algorithm that converges to a Nash equilibrium in two-player zero-sum games and to an extensive-form coarse correlated equilibrium in other games. DREAM achieves state-of-the-art performance among model-free algorithms and is competitive with simulator-based ones.

We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash Equilibrium in two-player zero-sum games and to an extensive-form coarse correlated equilibrium in all other games. Our primary innovation is an effective algorithm that, in contrast to other regret-based deep learning algorithms, does not require access to a perfect simulator of the game to achieve good performance. We show that DREAM empirically achieves state-of-the-art performance among model-free algorithms in popular benchmark games, and is even competitive with algorithms that do use a perfect simulator.
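
For readers new to regret-based methods, the following is a minimal tabular sketch of regret matching in self-play on rock-paper-scissors, a two-player zero-sum game. It is not DREAM: it uses no neural networks, no sampling, and no baselines, and it computes exact expected payoffs rather than learning model-free. The payoff matrix, the function names, and the nonzero starting regret are assumptions made for illustration only. What it shows is the property the abstract relies on: when both players minimize regret, their average strategies approach a Nash equilibrium.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors (actions: rock, paper, scissors).
# The game is zero-sum, so the column player receives the negative.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret yet: fall back to the uniform strategy.
    return np.full(len(regrets), 1.0 / len(regrets))


def self_play(iterations=20000):
    """Run regret matching for both players and return their average strategies."""
    # Illustrative assumption: a nonzero starting regret for the row player,
    # so the dynamics do not sit at the uniform strategy from the first step.
    regrets = [np.array([1.0, 0.0, 0.0]), np.zeros(3)]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        row = regret_matching(regrets[0])
        col = regret_matching(regrets[1])
        # Expected payoff of each pure action against the opponent's current strategy.
        row_values = PAYOFF @ col
        col_values = -(PAYOFF.T @ row)
        # Regret of an action = its value minus the value of the strategy actually played.
        regrets[0] += row_values - row @ row_values
        regrets[1] += col_values - col @ col_values
        strategy_sums[0] += row
        strategy_sums[1] += col
    return [s / iterations for s in strategy_sums]


def exploitability(avg_row, avg_col):
    """Sum of what each player could gain by best-responding to the other's average."""
    best_row = (PAYOFF @ avg_col).max()
    best_col = (-(PAYOFF.T @ avg_row)).max()
    return best_row + best_col


if __name__ == "__main__":
    avg_row, avg_col = self_play()
    print("average row strategy:", avg_row)
    print("exploitability:", exploitability(avg_row, avg_col))
```

The exploitability of the average strategies shrinks as the number of iterations grows, which is the sense in which regret minimization "converges" in two-player zero-sum games. Deep regret-based methods replace the regret table with function approximation so that the same idea can scale to games too large to enumerate.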

Code Implementations: 1 repo
