NE, AI, LG | Oct 16, 2019

Parallel Exploration via Negatively Correlated Search

arXiv:1910.07151v2 | 14 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of exploration in reinforcement learning for AI agents, providing an incremental improvement with a mathematical foundation for parallel search methods.

The paper tackles the problem of effective exploration in search algorithms by proposing a more principled version of Negatively Correlated Search (NCS), which frames parallel exploration as maximizing both population diversity and solution quality. It reports significant advantages over state-of-the-art methods on reinforcement learning tasks such as playing Atari games.

Effective exploration is key to successful search. The recently proposed Negatively Correlated Search (NCS) tries to achieve this by parallel exploration, where a set of search processes are driven to be negatively correlated so that different promising areas of the search space can be visited simultaneously. Various applications have verified the advantages of such novel search behaviors. Nevertheless, a mathematical understanding is still lacking, as the previous NCS was mostly devised by intuition. In this paper, a more principled NCS is presented, explaining that the parallel exploration is equivalent to the explicit maximization of both the population diversity and the population solution quality, and can be optimally obtained by partial gradient descent on both models with respect to each search process. For empirical assessment, reinforcement learning tasks that largely demand exploration ability are considered. The new NCS is applied to popular reinforcement learning problems, i.e., playing Atari games, to directly train a deep convolutional network with 1.7 million connection weights in environments with uncertain and delayed rewards. Empirical results show that the significant advantages of NCS over the compared state-of-the-art methods can be largely attributed to its effective parallel exploration ability.
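
To make the idea of negatively correlated parallel search concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract describes a gradient-based update, while this sketch swaps in a simpler accept/reject rule, and the toy objective, the equal-variance Gaussian search distributions, the Bhattacharyya-distance diversity measure, and all names and parameters (`sphere_mix`, `ncs_sketch`, `sigma`, `lam`) are assumptions chosen for illustration. Each search process keeps a candidate only if it improves a combined score that rewards both solution quality and distance from the other processes.

```python
import math
import random


def sphere_mix(x):
    """Toy multimodal objective to minimize (hypothetical, not from the paper)."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


def bhattacharyya(mu_a, mu_b, sigma):
    """Bhattacharyya distance between two isotropic Gaussians sharing sigma."""
    sq = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    return sq / (8.0 * sigma * sigma)


def diversity(i, mu, means, sigma):
    """Distance from mean `mu` of process i to the nearest other process."""
    return min(bhattacharyya(mu, m, sigma) for j, m in enumerate(means) if j != i)


def ncs_sketch(f, dim=2, n_procs=4, sigma=0.3, lam=0.1, iters=300, seed=0):
    """Run n_procs (>= 2) search processes that trade quality against diversity."""
    rng = random.Random(seed)
    means = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_procs)]
    best = list(min(means, key=f))
    for _ in range(iters):
        for i in range(n_procs):
            # Sample one candidate from process i's Gaussian search distribution.
            cand = [m + rng.gauss(0.0, sigma) for m in means[i]]
            # Lower is better: objective value minus a weighted diversity bonus.
            old = f(means[i]) - lam * diversity(i, means[i], means, sigma)
            new = f(cand) - lam * diversity(i, cand, means, sigma)
            if new < old:
                means[i] = cand
            # Track the best raw objective value seen, regardless of diversity.
            if f(cand) < f(best):
                best = cand
    return best, means


if __name__ == "__main__":
    best, means = ncs_sketch(sphere_mix)
    print("best objective value:", sphere_mix(best))
    print("final process means:", means)
```

The weight `lam` controls how strongly processes are pushed apart. With `lam = 0` the sketch reduces to independent hill climbers, which is one way to see what the diversity term contributes.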

