LGMar 3, 2017

EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

arXiv:1703.01260v2162 citations
Originality Incremental advance
AI Analysis

This addresses exploration difficulties in high-dimensional observation spaces like raw images for reinforcement learning practitioners, though it is incremental as it builds on existing novelty detection methods.

The paper tackles the challenge of sparse reward problems in deep reinforcement learning by proposing a novelty detection algorithm based on discriminatively trained exemplar models, achieving competitive results and state-of-the-art performance on the vizDoom benchmark.

Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the observations are very high-dimensional and complex, as in the case of raw images. We propose a novelty detection algorithm for exploration that is based entirely on discriminatively trained exemplar models, where classifiers are trained to discriminate each visited state against all others. Intuitively, novel states are easier to distinguish against other states seen during training. We show that this kind of discriminative modeling corresponds to implicit density estimation, and that it can be combined with count-based exploration to produce competitive results on a range of popular benchmark tasks, including state-of-the-art results on challenging egocentric observations in the vizDoom benchmark.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes