Comprehensive Facial Expression Synthesis using Human-Interpretable Language
This work addresses the need for human-interpretable control in facial expression synthesis. The contribution is incremental, building on existing action unit representations.
The paper tackles the problem of synthesizing facial expressions from natural language descriptions, enabling intuitive human control over detailed facial movements. The method effectively embeds language features into facial features, allowing individual words to control specific parts of facial movement.
Recent advances in facial expression synthesis have shown promising results using diverse expression representations, including facial action units. For elaborate facial expression synthesis, facial action units need to be represented intuitively for human comprehension rather than as a numeric categorization. To address this issue, we adopt a human-friendly approach: natural language, which helps humans grasp conceptual contexts. In this paper, we therefore propose a new model that synthesizes facial expressions from language-based facial expression descriptions. Our method can synthesize facial images with detailed expressions. In addition, by effectively embedding language features into facial features, our method lets individual words control each part of the facial movement. Extensive qualitative and quantitative evaluations were conducted to verify the effectiveness of using natural language.
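The abstract does not say how language features are embedded into facial features. One common way to let individual words influence individual face regions is cross-attention, in which each spatial location of a facial feature map attends over the word embeddings. The sketch below illustrates that idea only; it is not the paper's architecture. The function name `fuse_words_into_face`, the tensor shapes, and the residual fusion are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_words_into_face(face_feats, word_feats):
    """Hypothetical word-to-region fusion via cross-attention.

    face_feats: (H, W, D) facial feature map (assumed shape).
    word_feats: (T, D) one embedding per word of the description (assumed shape).
    Returns the fused map (H, W, D) and the attention weights (H, W, T).
    """
    h, w, d = face_feats.shape
    queries = face_feats.reshape(h * w, d)

    # Scaled dot-product score between every face location and every word.
    scores = queries @ word_feats.T / np.sqrt(d)

    # Each location distributes its attention over the words.
    attn = softmax(scores, axis=-1)

    # Word information gathered per location, added back as a residual.
    context = attn @ word_feats
    fused = queries + context

    return fused.reshape(h, w, d), attn.reshape(h, w, -1)


# Toy example: random tensors stand in for real encoder outputs.
rng = np.random.default_rng(0)
face = rng.normal(size=(4, 4, 8))   # 4x4 feature map with 8 channels
words = rng.normal(size=(3, 8))     # 3 words, 8-dimensional embeddings
fused, attn = fuse_words_into_face(face, words)
```

Under these assumptions, `attn[:, :, t]` is a spatial map showing which face regions word `t` influences most, which is one way to read the claim that individual words control parts of the facial movement.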