Deep Exploration via Bootstrapped DQN
This paper addresses the challenge of exploration in reinforcement learning, proposing a new method rather than an incremental refinement of existing ones.
The paper tackles efficient exploration in complex reinforcement learning environments by proposing bootstrapped DQN, which uses randomized value functions to carry out deep (temporally extended) exploration. This can lead to exponentially faster learning than dithering strategies such as epsilon-greedy, and it substantially improves learning times and performance across most Atari games.
Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through use of randomized value functions. Unlike dithering strategies such as epsilon-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; this can lead to exponentially faster learning. We demonstrate these benefits in complex stochastic MDPs and in the large-scale Arcade Learning Environment. Bootstrapped DQN substantially improves learning times and performance across most Atari games.
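To make the contrast with epsilon-greedy dithering concrete, the following is a minimal tabular sketch of exploration with randomized value functions. It is not the paper's implementation: the abstract gives no architectural details, so the class name `BootstrappedQ`, the use of Q-tables in place of a deep network, the number of heads, the per-episode head sampling, and the Bernoulli bootstrap mask are all illustrative assumptions.

```python
import numpy as np


class BootstrappedQ:
    """Hypothetical toy agent: K independent tabular Q estimates ("heads").

    Assumption: a tabular stand-in for a deep Q-network, used only to
    illustrate the idea of acting on one randomly chosen value function.
    """

    def __init__(self, n_states, n_actions, n_heads=10,
                 lr=0.1, gamma=0.99, mask_prob=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        # Small random initialisation so the heads disagree from the start.
        self.q = self.rng.normal(scale=0.01,
                                 size=(n_heads, n_states, n_actions))
        self.n_heads = n_heads
        self.lr = lr
        self.gamma = gamma
        self.mask_prob = mask_prob  # assumed Bernoulli bootstrap probability
        self.active = 0

    def start_episode(self):
        # Pick one head and commit to it for the whole episode. This is
        # what makes exploration temporally extended rather than per-step.
        self.active = int(self.rng.integers(self.n_heads))
        return self.active

    def act(self, state):
        # Greedy with respect to the active head only; no epsilon dithering.
        return int(np.argmax(self.q[self.active, state]))

    def update(self, s, a, r, s_next, done):
        # Each head sees the transition with probability mask_prob, so the
        # heads are trained on different bootstrap samples of the data.
        mask = self.rng.random(self.n_heads) < self.mask_prob
        bootstrap = 0.0 if done else self.gamma
        # Every head bootstraps from its own estimate of the next state.
        target = r + bootstrap * self.q[:, s_next].max(axis=1)
        td_error = target - self.q[:, s, a]
        self.q[:, s, a] += self.lr * mask * td_error
        return mask
```

In use, `start_episode()` would be called once at the start of each episode, and `act()` and `update()` at every step. Because the agent follows a single sampled value function for the whole episode, its actions stay consistent over many steps, whereas epsilon-greedy injects independent random actions at each step.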