CV · Apr 1, 2024

A Unified and Interpretable Emotion Representation and Expression Generation

arXiv:2404.01243v1 · 10 citations · h-index: 30 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses the need for a unified and interpretable emotion representation for fine-grained expression generation in AI and human-computer interaction, offering a new label space for emotions.

The paper unifies four emotion modalities (Canonical, Compound, Action Units, and Arousal-Valence) into an interpretable model called C2A2. It also modifies text-conditional diffusion models to generate continuous facial expressions; quantitative and qualitative experiments show that the generated images capture subtle expressions.

Canonical emotions, such as happy, sad, and fearful, are easy to understand and annotate. However, emotions are often compound, e.g. happily surprised, and can be mapped to the action units (AUs) used for expressing emotions, and trivially to the canonical ones. Intuitively, emotions are continuous as represented by the arousal-valence (AV) model. An interpretable unification of these four modalities - namely, Canonical, Compound, AUs, and AV - is highly desirable, for a better representation and understanding of emotions. However, such a unification remains unknown in the current literature. In this work, we propose an interpretable and unified emotion model, referred to as C2A2. We also develop a method that leverages labels of the non-unified models to annotate the novel unified one. Finally, we modify the text-conditional diffusion models to understand continuous numbers, which are then used to generate continuous expressions using our unified emotion model. Through quantitative and qualitative experiments, we show that our generated images are rich and capture subtle expressions. Our work allows a fine-grained generation of expressions in conjunction with other textual inputs and offers a new label space for emotions at the same time.

