E-React: Towards Emotionally Controlled Synthesis of Human Reactions
This work addresses the need for more natural and emotionally aware human motion synthesis in interactive tasks, representing an incremental improvement by adding emotion control to reaction generation.
The paper tackles the problem of generating human reaction motions that incorporate emotional cues, which existing frameworks ignore. It introduces a semi-supervised emotion prior in an actor-reactor diffusion model, and the reported experiments show that the model outperforms existing reaction generation methods.
Emotion is an essential component of daily human interaction. Existing human motion generation frameworks do not consider the impact of emotion, which reduces naturalness and limits their application in interactive tasks such as human reaction synthesis. In this work, we introduce a novel task: generating diverse reaction motions in response to different emotional cues. However, learning emotion representations from limited motion data and incorporating them into a motion generation framework remains challenging. To address these obstacles, we introduce a semi-supervised emotion prior in an actor-reactor diffusion model to facilitate emotion-driven reaction synthesis. Specifically, based on the observation that motion clips within a short sequence tend to share the same emotion, we first devise a semi-supervised learning framework to train an emotion prior. With this prior, we then train an actor-reactor diffusion model that generates reactions by considering both spatial interaction and emotional response. Finally, given a motion sequence of an actor, our approach can generate realistic reactions under various emotional conditions. Experimental results demonstrate that our model outperforms existing reaction generation methods. The code and data will be made publicly available at https://ereact.github.io/
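To illustrate the observation that motion clips within a short sequence tend to share the same emotion, the following minimal Python sketch spreads the few available emotion labels to the unlabeled clips of the same sequence. It is a toy under stated assumptions, not the training procedure of the paper: the function name `propagate_clip_labels`, the list-of-labels data layout, and the majority-vote rule are all hypothetical choices made here for illustration.

```python
from collections import Counter


def propagate_clip_labels(sequences):
    """Fill missing clip labels with the majority label of their sequence.

    `sequences` is a list of short sequences; each sequence is a list of
    per-clip emotion labels, where None marks an unlabeled clip.
    NOTE: this layout and the majority-vote rule are illustrative
    assumptions, not the method described in the paper.
    """
    filled = []
    for clip_labels in sequences:
        known = [label for label in clip_labels if label is not None]
        if not known:
            # No supervision in this sequence: leave every clip unlabeled.
            filled.append(list(clip_labels))
            continue
        # Most frequent known label within the same short sequence.
        majority = Counter(known).most_common(1)[0][0]
        filled.append(
            [label if label is not None else majority for label in clip_labels]
        )
    return filled


if __name__ == "__main__":
    toy = [
        ["happy", None, None],          # one labeled clip
        [None, None],                   # fully unlabeled sequence
        ["sad", "sad", None, "angry"],  # mixed labels, one gap
    ]
    for labels in propagate_clip_labels(toy):
        print(labels)
```

In a semi-supervised setting, pseudo-labels of this kind would typically serve as additional, noisier supervision alongside the ground-truth labels; how the paper actually uses the shared-emotion observation is not specified in the abstract.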