HC · AI · MM · Aug 26, 2023

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

arXiv:2308.13879v1 · 30 citations · h-index: 13 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for automated gesture generation in human-agent interaction, but it is incremental as it builds on existing methods to compete in a challenge.

The paper tackles the problem of generating realistic conversational gestures for embodied agents. It proposes DiffuseStyleGesture+, a diffusion-based model that takes audio, text, speaker ID, and seed gestures as inputs. In the GENEA Challenge 2023 it performed on par with the top-tier models, with no significant differences from them in human-likeness or appropriateness for the interlocutor, and competitive results on appropriateness for agent speech.

In this paper, we introduce DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures. Participants are provided with a pre-processed dataset, and their systems are evaluated through crowdsourced scoring. Our proposed model leverages a diffusion model to generate gestures automatically. It incorporates several modalities: audio, text, speaker ID, and seed gestures. These modalities are mapped to a hidden space and processed by a modified diffusion model to produce the corresponding gesture for a given speech input. In the evaluation, DiffuseStyleGesture+ performed on par with the top-tier models in the challenge: it showed no significant differences from those models in human-likeness or appropriateness for the interlocutor, and it was competitive with the best model on appropriateness for agent speech. This indicates that our model is competitive and effective in generating realistic and appropriate gestures for given speech. The code, pre-trained models, and demos are available at https://github.com/YoungSeng/DiffuseStyleGesture/tree/DiffuseStyleGesturePlus/BEAT-TWH-main.
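To make the pipeline in the abstract concrete, the following is a minimal sketch of the general idea: each modality is projected into a shared hidden space, the projections are fused into one conditioning vector, and a diffusion sampling loop denoises a pose under that condition. It is not the authors' architecture or code. The dimensions, the summation used for fusion, the random stand-in weights, the untrained `toy_denoiser`, and the generic DDPM noise-prediction update are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import math
import random

HIDDEN = 8      # hypothetical hidden-space size
POSE_DIM = 4    # hypothetical number of pose features per frame
STEPS = 20      # hypothetical number of diffusion steps


def linear(rng, in_dim, out_dim):
    """Random projection matrix, standing in for a learned linear layer."""
    return [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vec):
    """Matrix-vector product: map a feature vector through a projection."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def fuse_conditions(layers, features):
    """Project every modality to HIDDEN dims and sum into one condition vector.

    Summation is an assumption; concatenation or attention are other options.
    """
    cond = [0.0] * HIDDEN
    for name, vec in features.items():
        hidden = project(layers[name], vec)
        cond = [c + h for c, h in zip(cond, hidden)]
    return cond


def toy_denoiser(w_out, noisy_pose, cond, t):
    """Untrained placeholder noise predictor; a real model is a trained network."""
    return project(w_out, noisy_pose + cond + [t / STEPS])


def sample_pose(rng, layers, w_out, features):
    """Generic DDPM ancestral sampling of one pose frame under the fused condition."""
    betas = [1e-4 + (0.02 - 1e-4) * i / (STEPS - 1) for i in range(STEPS)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)

    cond = fuse_conditions(layers, features)
    x = [rng.gauss(0.0, 1.0) for _ in range(POSE_DIM)]  # start from pure noise
    for t in reversed(range(STEPS)):
        eps = toy_denoiser(w_out, x, cond, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


rng = random.Random(0)
features = {                       # made-up feature vectors, one per modality
    "audio": [0.1] * 6,
    "text": [0.2] * 5,
    "speaker_id": [1.0, 0.0, 0.0],  # one-hot speaker identity
    "seed_gesture": [0.0] * POSE_DIM,
}
layers = {name: linear(rng, len(vec), HIDDEN) for name, vec in features.items()}
w_out = linear(rng, POSE_DIM + HIDDEN + 1, POSE_DIM)
pose = sample_pose(rng, layers, w_out, features)
print(len(pose))
```

Because the weights are random, the sampled pose is meaningless; the sketch only shows how several conditioning signals can feed a single denoising loop. The released repository linked in the abstract is the place to look for the actual model.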

Code Implementations: 1 repo