A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents
This work addresses the problem of evaluating gesture generation in real-time interaction, a concern for researchers and developers of conversational agents. The contribution is incremental: it builds on existing methods and does not introduce new models.
The authors tackle the lack of real-time evaluation for modern gesture generation models in embodied conversational agents. They develop an extensible open-source framework that integrates a 3D agent, a chatbot backend, and a gesticulating system, so that different models can be tested in interactive settings.
Embodied conversational agents (ECAs) benefit from non-verbal behavior for natural and efficient interaction with users. Gesticulation (hand and arm movements accompanying speech) is an essential part of non-verbal behavior. Gesture generation models have been developed for several decades, starting with rule-based approaches and moving to mainly data-driven methods. However, recent end-to-end gesture generation methods have not yet been evaluated in real-time interaction with users. We present a proof-of-concept framework intended to facilitate the evaluation of modern gesture generation models in interaction. We demonstrate an extensible open-source framework that contains three components: 1) a 3D interactive agent; 2) a chatbot backend; 3) a gesticulating system. Each component can be replaced, which makes the proposed framework applicable for investigating the effect of different gesturing models in real-time interactions with different communication modalities, chatbot backends, or agent appearances. The code and video are available at the project page https://nagyrajmund.github.io/project/gesturebot.
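To illustrate what "each component can be replaced" could mean in practice, the following is a minimal sketch of a three-part pipeline with swappable components. It is not the framework's actual code: all names (`ChatbotBackend`, `GestureModel`, `Agent`, `Pipeline`, and the toy implementations) are hypothetical, and the gesture model is a placeholder that emits rest poses rather than a trained model.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

# Hypothetical interfaces, assumed for illustration only.
# One motion frame is a flat list of joint values.
Motion = List[List[float]]


class ChatbotBackend(Protocol):
    def reply(self, user_text: str) -> str: ...


class GestureModel(Protocol):
    def generate(self, speech_text: str) -> Motion: ...


class Agent(Protocol):
    def perform(self, speech_text: str, motion: Motion) -> None: ...


class EchoChatbot:
    """Toy chatbot backend: repeats the user's message."""

    def reply(self, user_text: str) -> str:
        return f"You said: {user_text}"


class RestPoseGestures:
    """Placeholder gesture model: one all-zero pose frame per word."""

    def __init__(self, n_joints: int = 3) -> None:
        self.n_joints = n_joints

    def generate(self, speech_text: str) -> Motion:
        return [[0.0] * self.n_joints for _ in speech_text.split()]


class LoggingAgent:
    """Stand-in for a 3D agent: records what it was asked to perform."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, int]] = []

    def perform(self, speech_text: str, motion: Motion) -> None:
        self.log.append((speech_text, len(motion)))


@dataclass
class Pipeline:
    """Wires the three components; any one can be swapped independently."""

    chatbot: ChatbotBackend
    gestures: GestureModel
    agent: Agent

    def turn(self, user_text: str) -> str:
        response = self.chatbot.reply(user_text)
        motion = self.gestures.generate(response)
        self.agent.perform(response, motion)
        return response
```

Under these assumptions, comparing two gesture models in the same interactive setting would amount to constructing two `Pipeline` objects that differ only in the `gestures` argument, while the chatbot and agent stay fixed.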