Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires
This work addresses the high computational cost faced by robotics researchers and practitioners, offering an incremental improvement over existing QD methods.
The paper tackles the sample inefficiency of Quality-Diversity (QD) algorithms for robotic skill discovery by proposing Dynamics-Aware QD (DA-QD), which uses dynamics models to achieve 20 times greater sample efficiency and enables zero-shot learning of new skill repertoires.
Quality-Diversity (QD) algorithms are powerful exploration algorithms that allow robots to discover large repertoires of diverse and high-performing skills. However, QD algorithms are sample-inefficient and require millions of evaluations. In this paper, we propose Dynamics-Aware Quality-Diversity (DA-QD), a framework that improves the sample efficiency of QD algorithms through the use of dynamics models. We also show how DA-QD can be used for continual acquisition of new skill repertoires. To do so, we incrementally train a deep dynamics model from the experience obtained while performing skill discovery with QD. We can then perform QD exploration in imagination with an imagined skill repertoire. We evaluate our approach on three robotic experiments. First, our experiments show that DA-QD is 20 times more sample-efficient than existing QD approaches for skill discovery. Second, we demonstrate learning an entirely new skill repertoire in imagination to perform zero-shot learning. Finally, we show that DA-QD is useful and effective for solving a long-horizon navigation task and for damage adaptation in the real world. Videos and source code are available at: https://sites.google.com/view/da-qd.
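
The loop summarized in the abstract (evaluate candidate skills in imagination, send only the promising ones to the real system, and retrain the model on the new experience) can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D system, the one-parameter linear model standing in for the deep dynamics model, the MAP-Elites-style grid, and every name and constant (`real_step`, `LinearDynamicsModel`, `run`, `TRUE_GAIN`, and so on) are assumptions chosen only to keep the example short and runnable.

```python
import random

T = 5            # rollout length (hypothetical toy setting)
N_CELLS = 20     # number of descriptor bins over [-5, 5]
TRUE_GAIN = 0.8  # hidden parameter of the toy "real" system

def real_step(s, a):
    """Ground-truth transition of the toy 1-D system (costly in practice)."""
    return s + TRUE_GAIN * a

class LinearDynamicsModel:
    """Toy stand-in for a deep model: fits s' - s = w * a by least squares."""
    def __init__(self):
        self.w = 0.0
        self.data = []  # (action, state delta) pairs

    def add(self, transitions):
        self.data.extend(transitions)

    def train(self):
        num = sum(a * d for a, d in self.data)
        den = sum(a * a for a, _ in self.data)
        if den > 0:
            self.w = num / den

    def step(self, s, a):
        return s + self.w * a

def rollout(actions, step_fn):
    """Run an open-loop skill; return (descriptor, fitness, transitions)."""
    s, transitions = 0.0, []
    for a in actions:
        s_next = step_fn(s, a)
        transitions.append((a, s_next - s))
        s = s_next
    fitness = -sum(a * a for a in actions)  # prefer low-effort skills
    return s, fitness, transitions

def cell_of(descriptor):
    """Map a final position in [-5, 5] to a discrete repertoire cell."""
    clipped = min(max(descriptor, -5.0), 5.0 - 1e-9)
    return int((clipped + 5.0) / 10.0 * N_CELLS)

def try_add(repertoire, actions, descriptor, fitness):
    """MAP-Elites-style insertion: keep the best skill found per cell."""
    c = cell_of(descriptor)
    if c not in repertoire or fitness > repertoire[c][1]:
        repertoire[c] = (actions, fitness)
        return True
    return False

def mutate(actions, rng, sigma=0.3):
    """Gaussian perturbation of each action, clipped to [-1, 1]."""
    return [min(max(a + rng.gauss(0, sigma), -1.0), 1.0) for a in actions]

def run(seed=0, n_init=10, n_iters=30, n_imagined=50):
    rng = random.Random(seed)
    model, repertoire = LinearDynamicsModel(), {}
    real_evals = imagined_evals = 0

    # Bootstrap with a few random skills evaluated on the real system.
    for _ in range(n_init):
        actions = [rng.uniform(-1, 1) for _ in range(T)]
        d, f, tr = rollout(actions, real_step)
        real_evals += 1
        model.add(tr)
        try_add(repertoire, actions, d, f)
    model.train()

    for _ in range(n_iters):
        imagined = dict(repertoire)  # imagined repertoire starts as a copy
        candidates = []
        for _ in range(n_imagined):
            parent = rng.choice(list(repertoire.values()))[0]
            child = mutate(parent, rng)
            d, f, _ = rollout(child, model.step)  # evaluation in imagination
            imagined_evals += 1
            if try_add(imagined, child, d, f):
                candidates.append(child)
        # Only skills the model predicts to be useful reach the real system.
        for child in candidates:
            d, f, tr = rollout(child, real_step)
            real_evals += 1
            model.add(tr)
            try_add(repertoire, child, d, f)
        model.train()  # retrain on all experience gathered so far
    return repertoire, model, real_evals, imagined_evals
```

Calling `run()` returns the real repertoire, the fitted model, and the two evaluation counters. In this toy setting the real dynamics are linear and noiseless, so the model becomes accurate almost immediately; the sketch shows only the control flow, and its counts say nothing about the sample-efficiency figures reported in the paper.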