NE | AI | Jun 20, 2023

Evolutionary Strategy Guided Reinforcement Learning via MultiBuffer Communication

Adam Callaghan, Karl Mason, Patrick Mannion

arXiv:2306.11535v1 | 4.93 citations | h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses control problems in robotics and simulation domains, but it is incremental, building on existing evolutionary reinforcement learning methods.

The paper combines evolutionary strategies with deep reinforcement learning through a multi-buffer system that keeps poorly performing trajectories from overpopulating the replay buffer. The resulting algorithm performs competitively and outperforms CEM-RL on 3 of the 4 MuJoCo control tasks tested.

Evolutionary Algorithms and Deep Reinforcement Learning have both successfully solved control problems across a variety of domains. Recently, algorithms have been proposed which combine these two methods, aiming to leverage the strengths and mitigate the weaknesses of both approaches. In this paper we introduce a new Evolutionary Reinforcement Learning model which combines a particular family of Evolutionary Algorithms, called Evolutionary Strategies, with the off-policy Deep Reinforcement Learning algorithm TD3. The framework utilises a multi-buffer system instead of a single shared replay buffer. The multi-buffer system allows the Evolutionary Strategy to search freely in the space of policies without risking overpopulating the replay buffer with poorly performing trajectories. Such trajectories limit the number of desirable policy behaviour examples in the buffer and thereby reduce the potential of the Deep Reinforcement Learning component within the shared framework. The proposed algorithm is demonstrated to perform competitively with current Evolutionary Reinforcement Learning algorithms on MuJoCo control tasks, outperforming the well-known state-of-the-art CEM-RL on 3 of the 4 environments tested.
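
The overpopulation problem described in the abstract can be illustrated with a small sketch. The abstract does not specify how the paper's buffers are organised or sampled, so everything below is an assumption made for illustration: the class name MultiBuffer, the two sources "rl" and "es", the per-buffer capacity, and the es_fraction mixing parameter are all hypothetical and should not be read as the authors' implementation.

```python
import random
from collections import deque


class MultiBuffer:
    """Hypothetical replay store with one bounded buffer per data source.

    This is an illustrative assumption, not the paper's actual design.
    """

    def __init__(self, capacity_per_buffer, seed=0):
        # Each source gets its own FIFO buffer, so one source cannot
        # evict the other's transitions.
        self.buffers = {
            "rl": deque(maxlen=capacity_per_buffer),
            "es": deque(maxlen=capacity_per_buffer),
        }
        self.rng = random.Random(seed)

    def add(self, source, transition):
        self.buffers[source].append(transition)

    def sample(self, batch_size, es_fraction=0.5):
        # es_fraction is a hypothetical knob controlling how much of the
        # batch comes from the evolutionary population's experience.
        n_es = min(int(batch_size * es_fraction), len(self.buffers["es"]))
        n_rl = min(batch_size - n_es, len(self.buffers["rl"]))
        batch = self.rng.sample(list(self.buffers["es"]), n_es)
        batch += self.rng.sample(list(self.buffers["rl"]), n_rl)
        return batch


def demo():
    """Compare a single shared buffer with the two-buffer sketch."""
    shared = deque(maxlen=100)
    multi = MultiBuffer(capacity_per_buffer=100)

    # The RL agent contributes 50 transitions.
    for i in range(50):
        transition = ("rl", i)
        shared.append(transition)
        multi.add("rl", transition)

    # The evolutionary population then contributes 500 transitions.
    for i in range(500):
        transition = ("es", i)
        shared.append(transition)
        multi.add("es", transition)

    rl_in_shared = sum(1 for source, _ in shared if source == "rl")
    rl_in_multi = len(multi.buffers["rl"])
    return rl_in_shared, rl_in_multi
```

In this toy setting the shared buffer ends up holding no RL-agent transitions at all, because the larger volume of population data evicts them, while the separate buffer keeps all 50. How transitions are actually routed and mixed in the paper may differ from this sketch.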
