RO NE Apr 19, 2018

Hierarchical Behavioral Repertoires with Unsupervised Descriptors

arXiv:1804.07127v1 18.437 citations

Originality: Highly original

AI Analysis

This paper addresses the problem of automating versatile behavior learning for robotics, with incremental improvements in efficiency and transferability.

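The paper tackles the problem of enabling artificial agents to learn complex behaviors. It proposes hierarchical behavioral repertoires, which stack several repertoires to generate sophisticated behaviors. Compared to traditional behavioral repertoires, this reduces the dimensionality of the optimization problems by orders of magnitude and yields behaviors whose fitness is twice as good. The paper also demonstrates transfer of knowledge between robots: a hierarchical repertoire evolved for a robotic arm to draw digits can be transferred to a humanoid robot by changing only the lowest layer of the hierarchy, without training the humanoid on the task.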

Enabling artificial agents to automatically learn complex, versatile and high-performing behaviors is a long-lasting challenge. This paper presents a step in this direction with hierarchical behavioral repertoires that stack several behavioral repertoires to generate sophisticated behaviors. Each repertoire of this architecture uses the lower repertoires to create complex behaviors as sequences of simpler ones, while only the lowest repertoire directly controls the agent's movements. This paper also introduces a novel approach to automatically define behavioral descriptors thanks to an unsupervised neural network that organizes the produced high-level behaviors. The experiments show that the proposed architecture enables a robot to learn how to draw digits in an unsupervised manner after having learned to draw lines and arcs. Compared to traditional behavioral repertoires, the proposed architecture reduces the dimensionality of the optimization problems by orders of magnitude and provides behaviors with a twice better fitness. More importantly, it enables the transfer of knowledge between robots: a hierarchical repertoire evolved for a robotic arm to draw digits can be transferred to a humanoid robot by simply changing the lowest layer of the hierarchy. This enables the humanoid to draw digits although it has never been trained for this task.
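The stacking and transfer ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Repertoire` class, the tuple descriptors, and the string "motor commands" are all hypothetical names chosen for illustration, and the entries are written by hand rather than evolved or organized by an unsupervised neural network. It only shows that a higher layer stores sequences of lower-layer descriptors, and that swapping the lowest layer retargets the same high-level behaviors to another robot.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical descriptor type: an index identifying one behavior in a repertoire.
Descriptor = Tuple[int, ...]


@dataclass
class Repertoire:
    """One layer of the hierarchy (illustrative, hand-filled, not evolved)."""
    lower: Optional["Repertoire"] = None
    entries: Dict[Descriptor, list] = field(default_factory=dict)

    def execute(self, descriptor: Descriptor) -> List[str]:
        genotype = self.entries[descriptor]
        if self.lower is None:
            # Lowest layer: the genotype directly encodes motor commands.
            return list(genotype)
        # Higher layer: the genotype is a sequence of lower-layer descriptors.
        commands: List[str] = []
        for lower_descriptor in genotype:
            commands.extend(self.lower.execute(lower_descriptor))
        return commands


# Lowest layer for a robotic arm (assumed primitives: two kinds of line).
arm_primitives = Repertoire(entries={
    (0,): ["arm: draw vertical line"],
    (1,): ["arm: draw horizontal line"],
})

# Higher layer: a digit-like behavior built as a sequence of primitives.
arm_digits = Repertoire(lower=arm_primitives, entries={
    (7,): [(1,), (0,)],
})

# Transfer: keep the higher layer's entries, replace only the lowest layer.
humanoid_primitives = Repertoire(entries={
    (0,): ["humanoid: draw vertical line"],
    (1,): ["humanoid: draw horizontal line"],
})
humanoid_digits = Repertoire(lower=humanoid_primitives, entries=arm_digits.entries)

print(arm_digits.execute((7,)))
print(humanoid_digits.execute((7,)))
```

In this sketch the transfer works only because both lowest layers expose the same descriptors, `(0,)` and `(1,)`. That shared descriptor space is the assumption that lets the higher layer remain unchanged.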
