LG AI ML Jun 11, 2018

The Potential of the Return Distribution for Exploration in RL

Thomas M. Moerland, Joost Broekens, Catholijn M. Jonker

arXiv:1806.04242v26.210 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses exploration challenges in RL, but it appears incremental because it builds on existing distributional methods.

The paper tackles exploration in deterministic reinforcement learning environments by studying return distributions and their network losses. Combined with exploration policies that use the return distribution, it solves a randomized Chain task of length 100, a result not previously reported when learning with neural networks.

This paper studies the potential of the return distribution for exploration in deterministic reinforcement learning (RL) environments. We study network losses and propagation mechanisms for Gaussian, Categorical and Gaussian mixture distributions. Combined with exploration policies that leverage this return distribution, we solve, for example, a randomized Chain task of length 100, which has not been reported before when learning with neural networks.
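As a rough illustration of what an exploration policy that leverages a return distribution might look like, the sketch below draws one sample from a Gaussian return estimate per action and acts greedily on the samples. The function name `sample_based_action`, the per-action `means` and `stds`, and the sampling rule itself are assumptions made for this example; the abstract does not specify the paper's actual policies.

```python
import random


def sample_based_action(means, stds, rng):
    """Return the index of the action with the highest sampled return.

    means[a] and stds[a] are hypothetical Gaussian return-distribution
    parameters for action a (illustrative, not taken from the paper).
    """
    samples = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return max(range(len(samples)), key=samples.__getitem__)


# Action 0 has a lower mean but an uncertain return; action 1 is certain.
rng = random.Random(0)
means = [0.0, 0.1]
stds = [1.0, 0.0]

counts = [0, 0]
for _ in range(1000):
    counts[sample_based_action(means, stds, rng)] += 1
```

Under these assumed numbers, the uncertain action is still selected on a sizeable fraction of steps even though its mean is lower, which is the kind of directed exploration a return distribution makes possible. Once the standard deviations shrink to zero, the rule reduces to greedy selection on the means.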
