LG · AI · Aug 7, 2023

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

arXiv:2308.03526v1 · 15 citations · h-index: 102
Originality: Incremental advance
AI Analysis

This work addresses the problem of advancing offline RL algorithms for complex simulated environments like StarCraft II, though it is incremental as it builds on existing datasets and methods.

The paper tackles offline reinforcement learning in StarCraft II by creating a benchmark with a dataset, tools, and an evaluation protocol, and reports a 90% win rate against the previously published AlphaStar behavior cloning agent using only offline data.

StarCraft II is one of the most challenging simulated reinforcement learning environments: it is partially observable, stochastic, and multi-agent, and mastering it requires strategic planning over long time horizons with real-time low-level execution. It also has an active professional competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because Blizzard has released a massive dataset of millions of StarCraft II games played by human players. This paper leverages that release to establish a benchmark, called AlphaStar Unplugged, that introduces unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard's release), tools standardizing an API for machine learning methods, and an evaluation protocol. We also present baseline agents, including behavior cloning and offline variants of actor-critic and MuZero. We improve the state of the art of agents using only offline data, and we achieve a 90% win rate against the previously published AlphaStar behavior cloning agent.
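The abstract names behavior cloning as one of the baseline agents. As a rough illustration of the general idea only, and not the paper's agents or API, the sketch below fits a tabular policy to logged (observation, action) pairs by picking the action the demonstrator chose most often in each observation. The function name `behavior_cloning`, the observation labels, and the action labels are all hypothetical toy values.

```python
from collections import Counter, defaultdict


def behavior_cloning(trajectories):
    """Fit a tabular policy from logged games (toy illustration, not the paper's method).

    Each trajectory is a list of (observation, action) pairs. The returned
    policy maps every observation seen in the data to the action that was
    logged most often for it.
    """
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for observation, action in trajectory:
            counts[observation][action] += 1
    # Greedy maximum-likelihood choice: most frequent logged action per observation.
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}


# Hypothetical logged games; real replays are far richer than these labels.
logged_games = [
    [("early", "build_worker"), ("mid", "expand"), ("late", "attack")],
    [("early", "build_worker"), ("mid", "attack"), ("late", "attack")],
    [("early", "scout"), ("mid", "expand"), ("late", "attack")],
]

policy = behavior_cloning(logged_games)
print(policy)
```

The point of the sketch is that learning uses only the fixed log of past games and never queries the environment, which is what makes the setting offline.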

Code Implementations: 1 repo