LG | AI | ML | Apr 12, 2019

Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments

arXiv:1904.06312v1 | 24 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses reproducibility issues for researchers in reinforcement learning. Its contribution is incremental: it focuses on improving reporting practices rather than introducing new methods.

The paper tackles the problem of reproducibility in reinforcement learning by demonstrating high variability in the performance of common agents from the OpenAI Baselines repository across multiple runs, and argues for reporting performance as distributions instead of point estimates to better capture this variability.

Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics of performance over multiple random seeds for a single environment. Unfortunately, there are still pernicious sources of variability in reinforcement learning agents that make reporting common summary statistics an unsound metric for performance. Our experiments demonstrate the variability of common agents used in the popular OpenAI Baselines repository. We make the case for reporting post-training agent performance as a distribution, rather than a point estimate.
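To make the abstract's recommendation concrete, here is a minimal sketch of reporting post-training returns as a distribution summary instead of a single mean. It is not the paper's code: the helper name `summarize_returns` is hypothetical, and the returns are synthetic values generated only for illustration, not results from the paper or from OpenAI Baselines.

```python
import random
import statistics


def summarize_returns(returns):
    """Summarize per-episode returns as a distribution, not only a mean.

    `returns` is a sequence of episode returns collected after training.
    The name and output layout are assumptions made for this sketch.
    """
    ordered = sorted(returns)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return {
        "n": len(ordered),
        "mean": statistics.fmean(ordered),
        "stdev": statistics.stdev(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Synthetic, illustrative data (NOT from the paper): two hypothetical agents
# with nearly the same mean return but very different distributions.
rng = random.Random(0)
bimodal_agent = ([rng.gauss(100, 5) for _ in range(50)]
                 + [rng.gauss(300, 5) for _ in range(50)])
unimodal_agent = [rng.gauss(200, 5) for _ in range(100)]

bimodal_summary = summarize_returns(bimodal_agent)
unimodal_summary = summarize_returns(unimodal_agent)

for name, summary in [("bimodal", bimodal_summary),
                      ("unimodal", unimodal_summary)]:
    print(name, {key: round(value, 1) for key, value in summary.items()})
```

In this synthetic setup the two means are close, so a point estimate would make the agents look equivalent, while the quartiles and spread show that one of them alternates between two very different performance levels. A histogram or empirical CDF of the same returns would serve the same purpose.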

Code Implementations: 1 repo
