LG · AI · ML · Jun 11, 2018

The Potential of the Return Distribution for Exploration in RL

arXiv:1806.04242v2 · 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses exploration challenges in RL, but it appears incremental because it builds on existing distributional methods.

The paper tackles exploration in deterministic reinforcement learning environments by studying return distributions and the network losses used to learn them. Combined with exploration policies that use the return distribution, it solves a randomized Chain task of length 100, a result not previously reported when learning with neural networks.

This paper studies the potential of the return distribution for exploration in deterministic reinforcement learning (RL) environments. We study network losses and propagation mechanisms for Gaussian, Categorical and Gaussian mixture distributions. Combined with exploration policies that leverage this return distribution, we solve, for example, a randomized Chain task of length 100, which has not been reported before when learning with neural networks.
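
As a rough illustration of the ideas in the abstract, the sketch below covers only the Gaussian case and is not taken from the paper's code. It assumes a deterministic transition, so a Gaussian return at the next state maps to a Gaussian target with mean r + gamma * mu' and standard deviation gamma * sigma'. The negative log-likelihood loss and the Thompson-sampling policy are plausible choices for "network loss" and "exploration policy", not a confirmed reproduction of the authors' method. All function names are hypothetical.

```python
import math
import random


def gaussian_bellman_target(reward, gamma, next_mean, next_std):
    """Propagate a Gaussian return through one deterministic step.

    If Z' ~ N(next_mean, next_std^2), then r + gamma * Z' is Gaussian with
    mean r + gamma * next_mean and standard deviation gamma * next_std.
    """
    return reward + gamma * next_mean, gamma * next_std


def gaussian_nll(target, mean, std):
    """Negative log-likelihood of a scalar target under N(mean, std^2).

    Assumed here as one possible training loss for a network that outputs
    (mean, std); requires std > 0.
    """
    return 0.5 * math.log(2.0 * math.pi * std ** 2) + 0.5 * ((target - mean) / std) ** 2


def thompson_action(means, stds, rng):
    """Sample one return per action and pick the action with the largest sample.

    Actions with wide return distributions are occasionally sampled high,
    which is what drives exploration in this sketch.
    """
    samples = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return max(range(len(samples)), key=samples.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-action return estimates at one state.
    means, stds = [0.0, 0.1], [1.0, 0.05]
    action = thompson_action(means, stds, rng)
    target_mean, target_std = gaussian_bellman_target(0.0, 0.99, means[action], stds[action])
    print(action, target_mean, target_std)
```

When every standard deviation is zero, the sampling policy reduces to a greedy choice over the means, so in this sketch exploration comes entirely from the spread of the return distribution.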

Code Implementations: 1 repo
