Enhancing MAP-Elites with Multiple Parallel Evolution Strategies
This work addresses the challenge of scaling QD algorithms to high-dimensional and uncertain settings, offering a more efficient method for researchers and practitioners in optimization and reinforcement learning. It appears incremental, however, as it builds on the existing MAP-Elites and ES frameworks.
The authors address the problem of using massively parallel evaluations effectively in Quality-Diversity (QD) algorithms. They propose MAP-Elites-Multi-ES (MEMES), which maintains up to 100 simultaneous Evolution Strategies (ES) processes on a single GPU. They show that it outperforms gradient-based and mutation-based QD methods on black-box optimization and QD-Reinforcement-Learning tasks, and that it outperforms sampling-based QD methods in uncertain domains under the same evaluation budget.
With the development of fast and massively parallel evaluations in many domains, Quality-Diversity (QD) algorithms, which have already proved promising in a wide range of applications, have seen their potential multiplied. However, we have yet to understand how best to use a large number of evaluations, as using them for random variations alone is not always effective. High-dimensional search spaces are a typical situation where random variations struggle to search effectively. Another is uncertain settings, where solutions can appear better than they truly are and naively evaluating more solutions might mislead QD algorithms. In this work, we propose MAP-Elites-Multi-ES (MEMES), a novel QD algorithm based on Evolution Strategies (ES) designed to exploit fast parallel evaluations more effectively. MEMES maintains multiple (up to 100) simultaneous ES processes, each with its own independent objective and a reset mechanism designed for QD optimisation, all on just a single GPU. We show that MEMES outperforms both gradient-based and mutation-based QD algorithms on black-box optimisation and QD-Reinforcement-Learning tasks, demonstrating its benefit across domains. Additionally, our approach outperforms sampling-based QD methods in uncertain domains when given the same evaluation budget. Overall, MEMES generates reproducible solutions that are high-performing and diverse through large-scale ES optimisation on easily accessible hardware.
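
The idea of several ES processes feeding one MAP-Elites archive can be illustrated with a small sketch. This is not the MEMES implementation: it is a minimal, sequential, pure-Python toy that assumes a 2-D grid archive, a simple z-scored ES update, and a stagnation-based reset. The names `fitness`, `descriptor`, `make_emitter`, `es_step`, and `run`, and the parameters `bins`, `patience`, `sigma`, and `lr`, are all hypothetical. So is the choice to alternate emitters between a fitness objective and a descriptor-distance objective; the abstract states only that each ES process has its own objective and reset mechanism.

```python
import random


def fitness(x):
    # Hypothetical toy objective: negative sphere function (higher is better).
    return -sum(v * v for v in x)


def descriptor(x):
    # Hypothetical behaviour descriptor: first two coordinates clipped to [-1, 1].
    return tuple(max(-1.0, min(1.0, v)) for v in x[:2])


def cell_of(desc, bins):
    # Map a descriptor in [-1, 1]^2 to a grid cell index.
    return tuple(min(bins - 1, int((d + 1.0) / 2.0 * bins)) for d in desc)


def try_insert(archive, x, bins):
    # MAP-Elites rule: keep x if its cell is empty or it beats the incumbent.
    f, c = fitness(x), cell_of(descriptor(x), bins)
    if c not in archive or f > archive[c][0]:
        archive[c] = (f, list(x))
        return True
    return False


def make_emitter(mean, explore):
    # Assumption: "explore" emitters maximise descriptor distance from their
    # starting point; "exploit" emitters maximise fitness.
    if explore:
        start = descriptor(mean)
        objective = lambda x: sum((a - b) ** 2 for a, b in zip(descriptor(x), start))
    else:
        objective = fitness
    return {"mean": list(mean), "objective": objective, "explore": explore, "stale": 0}


def es_step(mean, objective, rng, pop=16, sigma=0.1, lr=0.01):
    # One basic ES update: Gaussian perturbations, z-scored scores,
    # and a score-weighted average of the noise as the search direction.
    noises = [[rng.gauss(0.0, 1.0) for _ in mean] for _ in range(pop)]
    samples = [[m + sigma * e for m, e in zip(mean, eps)] for eps in noises]
    scores = [objective(s) for s in samples]
    mu = sum(scores) / pop
    std = (sum((s - mu) ** 2 for s in scores) / pop) ** 0.5 or 1.0
    grad = [
        sum((sc - mu) / std * eps[i] for sc, eps in zip(scores, noises)) / (pop * sigma)
        for i in range(len(mean))
    ]
    return [m + lr * g for m, g in zip(mean, grad)]


def run(num_emitters=4, dim=6, bins=5, generations=30, patience=5, seed=0):
    rng = random.Random(seed)
    archive = {}
    emitters = [
        make_emitter([rng.uniform(-1.0, 1.0) for _ in range(dim)], explore=(i % 2 == 1))
        for i in range(num_emitters)
    ]
    for _ in range(generations):
        # Each emitter is an independent ES process; here they run one by one.
        for i, em in enumerate(emitters):
            em["mean"] = es_step(em["mean"], em["objective"], rng)
            if try_insert(archive, em["mean"], bins):
                em["stale"] = 0
            else:
                em["stale"] += 1
            if em["stale"] >= patience:
                # Assumed reset rule: restart a stagnating emitter from a random elite.
                elite = rng.choice(list(archive.values()))[1]
                emitters[i] = make_emitter(elite, em["explore"])
    return archive
```

In a realistic setting, the loop over emitters and the sample evaluations inside `es_step` would be batched on the accelerator rather than run one at a time; the sketch only shows how the emitters, the archive, and the resets fit together.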