AI | LG | Oct 5, 2022

Atari-5: Distilling the Arcade Learning Environment down to Five Games

Matthew Aitchison, Penny Sweetser, Marcus Hutter

arXiv:2210.02019v1 | 27.446 citations | h-index: 45 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the computational expense and limited reproducibility of evaluating on the full ALE benchmark. It is an incremental improvement: evaluating on a small subset of games makes evaluation faster and more feasible.

The paper tackles the high computational cost of evaluating reinforcement learning algorithms on the full 57-game Arcade Learning Environment (ALE) by proposing a method for selecting small, representative subsets of games. The resulting 5-game subset (Atari-5) produces 57-game median score estimates within 10% of their true values, and a 10-game subset recovers 80% of the variance in log-scores across all 57 games.

The Arcade Learning Environment (ALE) has become an essential benchmark for assessing the performance of reinforcement learning algorithms. However, the computational cost of generating results on the entire 57-game dataset limits ALE's use and makes the reproducibility of many results infeasible. We propose a novel solution to this problem in the form of a principled methodology for selecting small but representative subsets of environments within a benchmark suite. We applied our method to identify a subset of five ALE games, called Atari-5, which produces 57-game median score estimates within 10% of their true values. Extending the subset to 10 games recovers 80% of the variance for log-scores for all games within the 57-game set. We show this level of compression is possible due to a high degree of correlation between many of the games in ALE.
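The abstract does not spell out the selection procedure, so the following is only a rough sketch of how a representative subset could be chosen, not the authors' implementation. It assumes a matrix of per-algorithm log-scores, fits a least-squares model that predicts each algorithm's median log-score from a handful of games, and searches all subsets of a given size for the lowest error. The data is synthetic, and the names `fit_subset` and `best_subset`, the no-intercept linear model, and the exhaustive search are all illustrative assumptions.

```python
import itertools

import numpy as np


def fit_subset(log_scores, target, subset):
    """Least-squares fit (no intercept) of target from the chosen game columns.

    Returns the fitted weights and the mean squared error of the fit.
    """
    X = log_scores[:, list(subset)]
    weights, *_ = np.linalg.lstsq(X, target, rcond=None)
    mse = float(np.mean((X @ weights - target) ** 2))
    return weights, mse


def best_subset(log_scores, target, k):
    """Exhaustively search all k-game subsets and keep the lowest-error one."""
    best = None
    for subset in itertools.combinations(range(log_scores.shape[1]), k):
        weights, mse = fit_subset(log_scores, target, subset)
        if best is None or mse < best[2]:
            best = (subset, weights, mse)
    return best


# Synthetic stand-in data (an assumption, not ALE results): every game's
# log-score is driven by one shared "algorithm quality" factor plus noise,
# which gives the strong between-game correlation the abstract describes.
rng = np.random.default_rng(0)
n_algorithms, n_games = 40, 11
quality = rng.normal(size=(n_algorithms, 1))
loadings = rng.uniform(0.5, 1.5, size=(1, n_games))
log_scores = quality @ loadings + 0.1 * rng.normal(size=(n_algorithms, n_games))

# Target to predict: each algorithm's median log-score over all games.
target = np.median(log_scores, axis=1)

subset, weights, mse = best_subset(log_scores, target, k=3)
baseline = float(np.var(target))  # error of always predicting the mean
_, mse_all = fit_subset(log_scores, target, range(n_games))

print("chosen games:", subset)
print("subset MSE:", mse, "baseline variance:", baseline)
```

Because the synthetic games share a single latent factor, a few columns already predict the median well. On real benchmark results the achievable error would depend on how correlated the games actually are.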

