LG AI ML | Jun 16, 2022

BYOL-Explore: Exploration by Bootstrapped Prediction

Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pîslar, Bernardo Avila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos

arXiv:2206.08332v1 | 29.796 citations | h-index: 88

Originality Highly original

AI Analysis

This paper addresses the challenge of exploration in reinforcement learning for AI agents in complex environments, and it is presented as a significant advance over prior methods.

The paper tackles curiosity-driven exploration in visually complex environments by introducing BYOL-Explore, a method that learns a world representation, the world dynamics, and an exploration policy through a single prediction loss. It achieves superhuman performance on the ten hardest exploration games in Atari and solves most tasks in the DM-HARD-8 benchmark without human demonstrations.

We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually-complex environments. BYOL-Explore learns a world representation, the world dynamics, and an exploration policy all-together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually-rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore's intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents.
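The abstract's idea of augmenting the extrinsic reward with an intrinsic reward derived from a latent-space prediction loss can be illustrated with a minimal sketch. The abstract does not specify the exact loss, so the code below assumes a BYOL-style squared error between L2-normalized vectors. The function names, the `intrinsic_scale` mixing coefficient, and the exponential-moving-average target update are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row vector to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def latent_prediction_error(predicted, target):
    """Squared distance between normalized predicted and target latents.

    Assumption: a BYOL-style normalized error, so values lie in [0, 4].
    """
    diff = l2_normalize(predicted) - l2_normalize(target)
    return np.sum(diff ** 2, axis=-1)


def augmented_reward(extrinsic, predicted, target, intrinsic_scale=0.1):
    """Add a scaled prediction-error bonus to the environment reward.

    `intrinsic_scale` is a hypothetical hyperparameter for this sketch.
    """
    intrinsic = latent_prediction_error(predicted, target)
    return extrinsic + intrinsic_scale * intrinsic, intrinsic


def ema_update(target_params, online_params, decay=0.99):
    """Move target parameters slowly toward the online parameters.

    Assumption: targets come from a slowly updated copy of the encoder.
    """
    return decay * target_params + (1.0 - decay) * online_params


# Three transitions: well predicted, orthogonal, and opposite latents.
predicted = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
target = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
extrinsic = np.array([1.0, 0.0, 0.0])

total, intrinsic = augmented_reward(
    extrinsic, predicted, target, intrinsic_scale=0.5
)
```

In this toy example the intrinsic bonus is close to 0 for the transition whose latent was predicted correctly and grows to about 2 and 4 for the poorly predicted ones, so an agent maximizing the combined reward is drawn toward transitions its world model cannot yet predict.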
