LG, Nov 19, 2021

Fast and Data-Efficient Training of Rainbow: an Experimental Study on Atari

arXiv:2111.10247v1, 16 citations
Originality: Incremental advance
AI Analysis

This work reduces the computational cost of training an RL algorithm, which makes it more feasible for researchers and practitioners to use in practice. The advance is incremental because it builds on an existing method.

The paper tackles the high data and computational costs of the Rainbow reinforcement learning algorithm by proposing an improved version that uses 20 times less data and trains in 7.5 hours on a single GPU, while maintaining competitive performance on the Arcade Learning Environment.

Across the Arcade Learning Environment, Rainbow achieves a level of performance competitive with humans and modern RL algorithms. However, attaining this level of performance requires large amounts of data and hardware resources, which makes research in this area computationally expensive and use in practical applications often infeasible. This paper's contribution is threefold: (1) we propose an improved version of Rainbow that seeks to drastically reduce Rainbow's data, training time, and compute requirements while maintaining its competitive performance; (2) we empirically demonstrate the effectiveness of our approach through experiments on the Arcade Learning Environment; and (3) we conduct a number of ablation studies to investigate the effect of the individual proposed modifications. Our improved version of Rainbow reaches a median human-normalized score close to classic Rainbow's, while using 20 times less data and requiring only 7.5 hours of training time on a single GPU. We also provide our full implementation, including pre-trained models.
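The headline metric above, the median human-normalized score, is conventionally computed per game as (agent - random) / (human - random) and then aggregated with the median across games. The following is a minimal sketch of that computation, not the paper's evaluation code. The function names, the game labels, and every number in `example_scores` are hypothetical placeholders chosen for illustration, not results from the paper.

```python
from statistics import median


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Rescale a raw game score so that 0.0 matches random play and 1.0 matches the human reference."""
    if human == random:
        raise ValueError("human and random reference scores must differ")
    return (agent - random) / (human - random)


def median_hns(scores: dict[str, tuple[float, float, float]]) -> float:
    """Median human-normalized score over games.

    `scores` maps a game name to (agent, random, human) raw scores.
    """
    return median(human_normalized_score(*triple) for triple in scores.values())


# Hypothetical placeholder values (NOT taken from the paper).
example_scores = {
    "game_a": (50.0, 0.0, 100.0),  # normalized: 0.5
    "game_b": (30.0, 10.0, 20.0),  # normalized: 2.0 (above the human reference)
    "game_c": (5.0, 5.0, 25.0),    # normalized: 0.0 (no better than random)
}

if __name__ == "__main__":
    print(median_hns(example_scores))
```

The median is used rather than the mean because a few games with very large normalized scores would otherwise dominate the aggregate.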

Code Implementations: 1 repo
