LG | AI | ML | Oct 30, 2018

Exploration by Random Network Distillation

arXiv:1810.12894v1 | 1658 citations | Has Code
Originality: Highly original
AI Analysis

The paper addresses exploration in reinforcement learning, particularly in sparse-reward environments, and represents a notable advance rather than an incremental improvement.

The authors tackle exploration in deep reinforcement learning by introducing an exploration bonus based on random network distillation (RND). The bonus enabled significant progress on hard-exploration Atari games, including state-of-the-art performance on Montezuma's Revenge, where the agent occasionally completes the first level.

We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular we establish state of the art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
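The bonus described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `RNDBonus`, the two-layer tanh networks, the layer sizes, the learning rate, and the use of a mean (rather than summed) squared error are all assumptions chosen to keep the example small. A fixed random target network maps observations to features, a predictor network is trained to match those features, and the per-observation prediction error serves as the exploration bonus.

```python
import numpy as np


class RNDBonus:
    """Toy random network distillation bonus (illustrative sketch only)."""

    def __init__(self, obs_dim, feat_dim=16, hidden=32, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialized target network: never updated.
        self.Wt1 = rng.normal(size=(obs_dim, hidden))
        self.Wt2 = rng.normal(size=(hidden, feat_dim)) / np.sqrt(hidden)
        # Predictor network: trained to reproduce the target's features.
        self.Wp1 = rng.normal(size=(obs_dim, hidden))
        self.Wp2 = rng.normal(size=(hidden, feat_dim)) / np.sqrt(hidden)
        self.lr = lr

    def _target(self, obs):
        return np.tanh(obs @ self.Wt1) @ self.Wt2

    def bonus(self, obs):
        """Per-observation mean squared prediction error (the bonus)."""
        pred = np.tanh(obs @ self.Wp1) @ self.Wp2
        return np.mean((pred - self._target(obs)) ** 2, axis=1)

    def update(self, obs):
        """One gradient-descent step on the predictor's mean squared error."""
        h = np.tanh(obs @ self.Wp1)
        err = h @ self.Wp2 - self._target(obs)
        g_out = 2.0 * err / err.size             # dLoss/dPrediction
        g_Wp2 = h.T @ g_out
        g_h = g_out @ self.Wp2.T
        g_Wp1 = obs.T @ (g_h * (1.0 - h ** 2))   # backprop through tanh
        self.Wp2 -= self.lr * g_Wp2
        self.Wp1 -= self.lr * g_Wp1


# Hypothetical usage: observations seen repeatedly earn a shrinking bonus.
rng = np.random.default_rng(1)
familiar = rng.normal(size=(64, 4))
rnd = RNDBonus(obs_dim=4)
before = rnd.bonus(familiar).mean()
for _ in range(500):
    rnd.update(familiar)
after = rnd.bonus(familiar).mean()
```

In this sketch the bonus on the repeatedly visited batch falls as the predictor is trained, while observations unlike the training batch would be expected to keep a larger error. In an agent, the bonus would be added to the learning signal as an intrinsic reward; how the paper combines it with the extrinsic reward is not detailed in the abstract and is not modeled here.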

Code Implementations: 23 repos

Data from Papers with Code (CC-BY-SA-4.0)

