NE AI · Dec 7, 2022

Curiosity creates Diversity in Policy Search

Paul-Antoine Le Tolguenec, Emmanuel Rachelson, Yann Besse, Dennis G. Wilson

arXiv:2212.03530v2 · 2.72 citations · h-index: 4

Originality: Incremental advance

AI Analysis

This is an incremental improvement for reinforcement learning in sparse-reward settings.

The paper tackles policy search in reward-sparse environments by using Curiosity, an intrinsic motivation metric, as the fitness function of an evolutionary strategy. Compared with Novelty, this yields higher diversity over full episodes and multiple reward-finding policies without an explicit diversity criterion.

When searching for policies, reward-sparse environments often lack sufficient information about which behaviors to improve upon or avoid. In such environments, the policy search process is bound to blindly search for reward-yielding transitions and no early reward can bias this search in one direction or another. A way to overcome this is to use intrinsic motivation in order to explore new transitions until a reward is found. In this work, we use a recently proposed definition of intrinsic motivation, Curiosity, in an evolutionary policy search method. We propose Curiosity-ES, an evolutionary strategy adapted to use Curiosity as a fitness metric. We compare Curiosity with Novelty, a commonly used diversity metric, and find that Curiosity can generate higher diversity over full episodes without the need for an explicit diversity criterion and lead to multiple policies which find reward.
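The idea of using an intrinsic signal as the fitness of an evolutionary strategy can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the formal definition of Curiosity, so the sketch assumes it can be approximated by the prediction error of a learned forward model. The toy 2-D environment, the linear policy, the `ForwardModel` class, and the plain Gaussian-perturbation update are all hypothetical choices made for the example.

```python
import numpy as np

def rollout(theta, steps=20):
    """Run a linear policy in a toy 2-D point environment (hypothetical, no reward)."""
    W = theta.reshape(2, 2)
    s = np.array([0.1, -0.1])
    transitions = []
    for _ in range(steps):
        a = np.tanh(W @ s)                        # bounded action
        s_next = np.clip(s + 0.1 * a, -1.0, 1.0)  # simple bounded dynamics
        transitions.append((s, a, s_next))
        s = s_next
    return transitions

class ForwardModel:
    """Linear model predicting s_next from (s, a); its error stands in for Curiosity."""
    def __init__(self, lr=0.05):
        self.M = np.zeros((2, 4))
        self.lr = lr

    def error(self, s, a, s_next):
        x = np.concatenate([s, a])
        return float(np.sum((self.M @ x - s_next) ** 2))

    def update(self, s, a, s_next):
        x = np.concatenate([s, a])
        # One gradient step on half the squared prediction error.
        self.M -= self.lr * np.outer(self.M @ x - s_next, x)

def curiosity_fitness(model, transitions):
    """Assumed fitness: summed forward-model error over a full episode."""
    return sum(model.error(*t) for t in transitions)

def curiosity_es(generations=10, pop_size=16, sigma=0.1, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=4)
    model = ForwardModel()
    history = []
    for _ in range(generations):
        noise = rng.normal(size=(pop_size, 4))
        rollouts = [rollout(theta + sigma * eps) for eps in noise]
        fitness = np.array([curiosity_fitness(model, r) for r in rollouts])
        history.append(float(fitness.mean()))
        # Standardize fitness, then move theta toward perturbations that scored higher.
        adv = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
        theta = theta + alpha / (pop_size * sigma) * noise.T @ adv
        # Train the model on what was just seen, so familiar transitions score lower later.
        for r in rollouts:
            for t in r:
                model.update(*t)
    return theta, history

theta, history = curiosity_es()
```

No extrinsic reward appears anywhere in the loop: the search is driven only by how poorly the model predicts the transitions a policy visits, which is the role the abstract assigns to Curiosity in reward-sparse settings.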
