Efficient Quality-Diversity Optimization through Diverse Quality Species
This work addresses the need for more efficient and flexible QD methods in fields such as robotics, though it is an incremental improvement over existing approaches.
The paper addresses the reliance of Quality-Diversity (QD) optimization on predefined archives and behavior ranges. It proposes Diverse Quality Species (DQS), an alternative that learns diverse, high-performing solutions through unsupervised skill discovery and gradient-based mutations, and reports improved sample efficiency and performance compared to other QD algorithms.
A prevalent limitation of optimizing over a single objective is that the search can be misguided, becoming trapped in a local optimum. This can be rectified by Quality-Diversity (QD) algorithms, which seek a population of high-quality and diverse solutions to a problem. Most conventional QD approaches, for example MAP-Elites, explicitly manage a behavioral archive in which solutions are divided into predefined niches. In this work, we show that a diverse population of solutions can be found without needing an archive or defining the range of behaviors in advance. Instead, we divide solutions into independently evolving species and use unsupervised skill discovery to learn diverse, high-performing solutions. We show that this can be done through gradient-based mutations that take an information-theoretic perspective of jointly maximizing mutual information and performance. We propose Diverse Quality Species (DQS) as an alternative to archive-based QD algorithms. We evaluate it on several simulated robotic environments and show that it can learn a diverse set of solutions from varying species. Furthermore, our results show that DQS is more sample-efficient and performant than other QD algorithms. Relevant code and hyper-parameters are available at: https://github.com/rwickman/NEAT_RL.
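To make the idea of jointly maximizing mutual information and performance more concrete, the following is a minimal sketch of one common formulation from unsupervised skill discovery: a task reward augmented with a variational lower bound on the mutual information between visited states and the species identity. It is an illustration only and is not taken from the paper or its repository; the function names, the discriminator logits, the uniform prior over species, and the `diversity_weight` parameter are all assumptions.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def combined_reward(task_reward, disc_logits, species_id, num_species,
                    diversity_weight=1.0):
    """Task reward plus a mutual-information bonus (hypothetical helper).

    disc_logits: assumed output of a discriminator q(z | s) for the visited
    state, one logit per species. The bonus log q(z | s) - log p(z) uses an
    assumed uniform prior p(z) = 1 / num_species.
    """
    log_q = log_softmax(np.asarray(disc_logits, dtype=float))[species_id]
    log_p = -np.log(num_species)
    return task_reward + diversity_weight * (log_q - log_p)


# Uninformative discriminator: the diversity bonus vanishes.
neutral = combined_reward(2.0, [0.0, 0.0, 0.0, 0.0], species_id=1, num_species=4)

# State that clearly identifies species 0: positive bonus for species 0,
# negative bonus for any other species visiting the same state.
own = combined_reward(0.0, [10.0, 0.0, 0.0, 0.0], species_id=0, num_species=4)
other = combined_reward(0.0, [10.0, 0.0, 0.0, 0.0], species_id=2, num_species=4)
```

Under these assumptions, a species is rewarded for reaching states that a discriminator can attribute to it, so maximizing the combined reward by gradient ascent pushes species apart in behavior while still optimizing task performance. The weighting between the two terms is a design choice that this sketch leaves as a free parameter.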