Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
This work addresses exploration in large state spaces and under sparse rewards, a problem relevant to reinforcement learning practitioners, though it appears incremental because it builds on existing algorithms.
The paper tackles the challenge of efficient exploration in reinforcement learning by introducing a diversity-driven approach that adds a distance measure to the loss function. On several Atari 2600 tasks, the method improves on baseline approaches in mean scores and exploration efficiency.
Efficient exploration remains a challenging research problem in reinforcement learning, especially when an environment contains large state spaces, deceptive local optima, or sparse rewards. To tackle this problem, we present a diversity-driven approach for exploration, which can be easily combined with both off- and on-policy reinforcement learning algorithms. We show that by simply adding a distance measure to the loss function, the proposed methodology significantly enhances an agent's exploratory behaviors, thereby preventing the policy from being trapped in local optima. We further propose an adaptive scaling method for stabilizing the learning process. Our experimental results in Atari 2600 show that our method outperforms baseline approaches in several tasks in terms of mean scores and exploration efficiency.
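
The idea of adding a distance measure to the loss function can be illustrated with a minimal sketch. The abstract does not specify the distance measure, what it is measured against, or the scaling rule, so everything here is an assumption: the distance is taken to be the KL divergence between the current policy's action distribution and a set of recently used policies, it is subtracted from the base loss so that minimizing the loss pushes the policy away from its predecessors, and `alpha` is a fixed hypothetical scaling factor rather than the adaptive scaling mentioned above. The names `kl_divergence` and `diversity_augmented_loss` are illustrative only.

```python
import math


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete action distributions given as lists.

    Assumption: KL is used as the distance measure for illustration only.
    """
    return sum(
        pi * math.log(max(pi, eps) / max(qi, eps))
        for pi, qi in zip(p, q)
    )


def diversity_augmented_loss(base_loss, current_policy, prior_policies, alpha=0.1):
    """Subtract a scaled mean distance to prior policies from the base loss.

    base_loss      -- loss of the underlying RL algorithm (e.g. a TD loss)
    current_policy -- action probabilities of the current policy in a state
    prior_policies -- list of action probabilities from earlier policies
    alpha          -- hypothetical fixed scaling factor (not the adaptive one)
    """
    if not prior_policies:
        return base_loss
    mean_distance = sum(
        kl_divergence(current_policy, prior) for prior in prior_policies
    ) / len(prior_policies)
    # A larger distance lowers the loss, rewarding behavior that differs
    # from what earlier policies did in the same state.
    return base_loss - alpha * mean_distance


# A policy identical to its predecessor receives no diversity bonus,
# while a policy that differs receives a lower loss.
same = diversity_augmented_loss(1.0, [0.5, 0.5], [[0.5, 0.5]])
different = diversity_augmented_loss(1.0, [0.9, 0.1], [[0.5, 0.5]])
```

In this sketch the base algorithm is untouched and only its scalar loss is modified, which is consistent with the claim that the approach can be combined with both off- and on-policy methods.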