NE | AI | Apr 23, 2024

Evolutionary Reinforcement Learning via Cooperative Coevolution

arXiv:2404.14763v3 | 2 citations | h-index: 5 | ECAI
Originality: Incremental advance
AI Analysis

This work addresses a scalability bottleneck in evolutionary reinforcement learning that matters for domains such as robotics, though it is an incremental improvement over existing methods.

The paper tackles the poor scalability of genetic operators when evolutionary reinforcement learning is applied to high-dimensional neural networks. It proposes CoERL, which combines cooperative coevolution with partial-gradient updates, and reports that CoERL outperforms seven state-of-the-art algorithms and baselines on six benchmark locomotion tasks.

Recently, evolutionary reinforcement learning has obtained much attention in various domains. Maintaining a population of actors, evolutionary reinforcement learning utilises the collected experiences to improve the behaviour policy through efficient exploration. However, the poor scalability of genetic operators limits the efficiency of optimising high-dimensional neural networks. To address this issue, this paper proposes a novel cooperative coevolutionary reinforcement learning (CoERL) algorithm. Inspired by cooperative coevolution, CoERL periodically and adaptively decomposes the policy optimisation problem into multiple subproblems and evolves a population of neural networks for each of the subproblems. Instead of using genetic operators, CoERL directly searches for partial gradients to update the policy. Updating policy with partial gradients maintains consistency between the behaviour spaces of parents and offspring across generations. The experiences collected by the population are then used to improve the entire policy, which enhances the sampling efficiency. Experiments on six benchmark locomotion tasks demonstrate that CoERL outperforms seven state-of-the-art algorithms and baselines. Ablation study verifies the unique contribution of CoERL's core ingredients.
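
The decompose-then-update loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy quadratic fitness in place of episode returns, a random (not adaptive) regrouping of parameters, and an antithetic evolution-strategy estimator for the partial gradients. All function names and hyperparameters here (`toy_fitness`, `random_decomposition`, `partial_gradient`, `coevolve`, `regroup_every`) are hypothetical, and the step that reuses collected experiences to improve the whole policy is omitted.

```python
import numpy as np

def toy_fitness(theta):
    # Hypothetical stand-in for an episode return: higher is better, best at theta = 1.
    return -float(np.sum((theta - 1.0) ** 2))

def random_decomposition(dim, n_groups, rng):
    # Split the parameter indices into disjoint groups, one subproblem per group.
    # Assumption: random grouping; the paper describes an adaptive decomposition.
    return np.array_split(rng.permutation(dim), n_groups)

def partial_gradient(theta, group, fitness, rng, pop_size=20, sigma=0.1):
    # Estimate the gradient with respect to the coordinates in `group` only,
    # by perturbing just those coordinates (antithetic ES-style estimator).
    eps = rng.standard_normal((pop_size, group.size))
    grad = np.zeros(group.size)
    for e in eps:
        plus, minus = theta.copy(), theta.copy()
        plus[group] += sigma * e
        minus[group] -= sigma * e
        grad += (fitness(plus) - fitness(minus)) * e
    return grad / (2.0 * sigma * pop_size)

def coevolve(dim=12, n_groups=3, generations=200, lr=0.05, regroup_every=10, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    groups = random_decomposition(dim, n_groups, rng)
    for g in range(generations):
        # Periodically re-decompose the problem into new subproblems.
        if g > 0 and g % regroup_every == 0:
            groups = random_decomposition(dim, n_groups, rng)
        # Update each subproblem's coordinates with its own partial gradient.
        for group in groups:
            theta[group] += lr * partial_gradient(theta, group, toy_fitness, rng)
    return theta

if __name__ == "__main__":
    theta = coevolve()
    print("final fitness:", toy_fitness(theta))
```

Because each subproblem perturbs only its own coordinates, every sampled candidate differs from the current parameters in a small subspace, which is one way to read the abstract's point about keeping parent and offspring behaviour consistent.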

