LG, ML
Dec 13, 2019

Dota 2 with Large Scale Deep Reinforcement Learning

OpenAI
arXiv:1912.06680v1
2142 citations
Originality: Incremental advance
AI Analysis

OpenAI Five demonstrates that self-play reinforcement learning can reach superhuman performance on a difficult task with long time horizons and imperfect information.

OpenAI Five tackled the challenge of mastering the complex esports game Dota 2, which involves long time horizons and imperfect information, and achieved superhuman performance by defeating the world champion team after 10 months of training.

On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of approximately 2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
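To put the quoted training scale in perspective, the following is a rough back-of-the-envelope sketch in Python. It uses only the two figures from the abstract (about 2 million frames per batch, one batch every 2 seconds). The 30-day month and the assumption that training ran continuously at a constant rate are illustrative assumptions, not claims from the source, so the total is an upper-bound estimate under those assumptions rather than a reported result.

```python
# Back-of-the-envelope throughput from the figures quoted in the abstract.
FRAMES_PER_BATCH = 2_000_000  # "approximately 2 million frames"
SECONDS_PER_BATCH = 2         # "every 2 seconds"

# Assumption (not from the source): 30-day months and uninterrupted
# training at a constant rate for the full 10 months.
ASSUMED_TRAINING_DAYS = 10 * 30
SECONDS_PER_DAY = 24 * 60 * 60


def frames_per_second(frames_per_batch, seconds_per_batch):
    """Average number of frames consumed per second of training."""
    return frames_per_batch / seconds_per_batch


def upper_bound_frames(rate, days):
    """Total frames if the given rate were sustained for `days` days."""
    return rate * days * SECONDS_PER_DAY


rate = frames_per_second(FRAMES_PER_BATCH, SECONDS_PER_BATCH)
total = upper_bound_frames(rate, ASSUMED_TRAINING_DAYS)

print(f"{rate:,.0f} frames per second")
print(f"{total:.3e} frames (upper bound under the stated assumptions)")
```

Under these assumptions the rate works out to roughly one million frames per second. The total should be read only as an order-of-magnitude ceiling, since it assumes the quoted rate held for the whole training period.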

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
