Teaching Robots to Span the Space of Functional Expressive Motion
This work addresses the inefficiency and poor generalization of per-emotion modeling in robotics, offering a more scalable approach to expressive human-robot interaction.
The paper aims to let robots perform functional tasks with expressive motion. Instead of learning a separate model for each predefined emotion, it learns a single model that maps trajectories to a latent Valence-Arousal-Dominance (VAD) space, which allows generalization to any emotion in that space. The approach is evaluated in simulation and in a user study, using a simple vacuum robot and the Cassie biped.
Our goal is to enable robots to perform functional tasks in emotive ways, whether in response to their users' emotional states or to express their own confidence levels. Prior work has proposed learning an independent cost function from user feedback for each target emotion, so that the robot may optimize it alongside task- and environment-specific objectives in any situation it encounters. However, this approach is inefficient when modeling multiple emotions and cannot generalize to new ones. In this work, we leverage the fact that emotions are not independent of each other: they are related through a latent space of Valence-Arousal-Dominance (VAD). Our key idea is to learn, from user labels, a model of how trajectories map onto VAD. Considering the distance between a trajectory's mapping and a target VAD allows this single model to represent cost functions for all emotions. As a result, 1) all user feedback can contribute to learning about every emotion; 2) the robot can generate trajectories for any emotion in the space instead of only a few predefined ones; and 3) the robot can respond emotively to user-generated natural language by mapping it to a target VAD. We introduce a method that interactively learns to map trajectories to this latent space and test it in simulation and in a user study. In experiments, we use a simple vacuum robot as well as the Cassie biped.
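The idea of turning one trajectory-to-VAD model into a cost function for any emotion can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: `toy_vad_model` is a hand-written placeholder standing in for the learned mapping, the squared Euclidean distance is one possible choice of distance, and the VAD coordinates are assumed to lie roughly in [-1, 1]. The abstract does not specify any of these details.

```python
import math
from typing import Callable, Sequence, Tuple

Trajectory = Sequence[Tuple[float, float]]  # time-indexed (x, y) waypoints
VAD = Tuple[float, float, float]            # (valence, arousal, dominance)


def toy_vad_model(traj: Trajectory) -> VAD:
    """Hypothetical placeholder for the learned trajectory-to-VAD model.

    It only ties arousal to the mean step length between waypoints and
    leaves valence and dominance at zero. A real model would be learned
    from user labels.
    """
    steps = [math.dist(p, q) for p, q in zip(traj, traj[1:])]
    mean_step = sum(steps) / len(steps)
    return (0.0, math.tanh(mean_step), 0.0)


def emotion_cost(traj: Trajectory, target: VAD,
                 model: Callable[[Trajectory], VAD]) -> float:
    """Squared distance between the trajectory's predicted VAD and the target."""
    predicted = model(traj)
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


# Two candidate trajectories for the same straight-line task.
slow = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
fast = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]

# Assumed target points in VAD space: high arousal vs. low arousal.
excited: VAD = (0.0, 0.9, 0.0)
calm: VAD = (0.0, 0.0, 0.0)

best_for_excited = min([slow, fast],
                       key=lambda t: emotion_cost(t, excited, toy_vad_model))
best_for_calm = min([slow, fast],
                    key=lambda t: emotion_cost(t, calm, toy_vad_model))
```

Changing the target point changes which candidate is preferred while the model stays fixed, which is the sense in which a single model can represent cost functions for all emotions. In the setting described above, this cost would be optimized alongside task and environment objectives rather than on its own; the sketch leaves those terms out.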