LG · ML · Jun 30, 2020

Evaluating the Performance of Reinforcement Learning Algorithms

arXiv:2006.16958v2 · 59 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses reproducibility issues facing reinforcement learning researchers. Its contribution is incremental: it improves evaluation metrics rather than introducing new algorithms.

The paper tackles the problem of inconsistent and non-reproducible performance results in reinforcement learning. It proposes a new comprehensive evaluation methodology that produces reliable measurements and demonstrates it on standard benchmark tasks.

Performance evaluations are critical for quantifying algorithmic advances in reinforcement learning. Recent reproducibility analyses have shown that reported performance results are often inconsistent and difficult to replicate. In this work, we argue that the inconsistency of performance stems from the use of flawed evaluation metrics. Taking a step towards ensuring that reported results are consistent, we propose a new comprehensive evaluation methodology for reinforcement learning algorithms that produces reliable measurements of performance both on a single environment and when aggregated across environments. We demonstrate this method by evaluating a broad class of reinforcement learning algorithms on standard benchmark tasks.

Code Implementations: 1 repo
