ML LG · Sep 18, 2018

Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning

Izumi Karino, Kazutoshi Tanaka, Ryuma Niiyama, Yasuo Kuniyoshi

arXiv:1809.06570v21.91 citations

Originality: Incremental advance

AI Analysis

This work addresses exploration challenges in deep reinforcement learning, particularly for sparse reward settings, though it appears incremental as it builds on prior parameter space noise methods.

The paper tackles the problem of exploration in deep reinforcement learning by proposing a parameter space noise method that switches between isotropic and directional exploration based on accumulated returns. The method achieves competitive performance on baseline tasks and shows better performance in sparse reward environments.

This paper proposes an exploration method for deep reinforcement learning based on parameter space noise. Recent studies have experimentally shown that parameter space noise results in better exploration than the commonly used action space noise. Previous methods devised a way to update the diagonal covariance matrix of a noise distribution but did not consider the direction of the noise vector or its correlations. In addition, fast updates of the noise distribution are required to facilitate policy learning. We propose a method that deforms the noise distribution according to the accumulated returns and the noises that have led to those returns. Moreover, this method switches between isotropic exploration and directional exploration in parameter space depending on the obtained rewards. We validate our exploration strategy in the OpenAI Gym continuous environments and in modified environments with sparse rewards. The proposed method achieves results that are competitive with a previous method on baseline tasks. Moreover, our approach exhibits better performance in sparse reward environments through exploration with the switching strategy.
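
The switching idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the update rule, so the function names (`return_weighted_direction`, `sample_noise`), the `shift` parameter, and the rule "fall back to isotropic noise when all returns are equal" are assumptions made purely for illustration.

```python
import numpy as np


def return_weighted_direction(noises, returns):
    """Hypothetical helper: unit vector pointing toward the noise vectors
    that led to above-average returns. Returns None when the returns carry
    no signal (e.g. no reward found yet in a sparse reward task)."""
    noises = np.asarray(noises, dtype=float)    # shape (n_samples, dim)
    returns = np.asarray(returns, dtype=float)  # shape (n_samples,)
    advantages = returns - returns.mean()
    if np.allclose(advantages, 0.0):
        return None
    direction = advantages @ noises             # return-weighted sum of noises
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return None
    return direction / norm


def sample_noise(rng, dim, sigma, direction=None, shift=1.0):
    """Hypothetical sampler: isotropic Gaussian noise when `direction` is
    None, otherwise the same noise shifted along `direction`
    (an assumed form of directional exploration)."""
    eps = rng.normal(0.0, sigma, size=dim)
    if direction is None:
        return eps                              # isotropic exploration
    return eps + shift * sigma * direction      # directional exploration


# Usage: perturb policy parameters with the sampled noise.
rng = np.random.default_rng(0)
past_noises = [[1.0, 0.0], [-1.0, 0.0]]
past_returns = [1.0, 0.0]
direction = return_weighted_direction(past_noises, past_returns)
theta = np.zeros(2)                             # placeholder policy parameters
theta_perturbed = theta + sample_noise(rng, 2, sigma=0.1, direction=direction)
```

In this sketch, exploration stays isotropic while every rollout yields the same return and becomes biased toward previously successful noise directions once returns start to differ.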
