RO HC LG | Sep 15, 2025

Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents

arXiv:2509.12507v117 citations | h-index: 36 | Front Robot AI
Originality: Incremental advance
AI Analysis

The work addresses the need for non-verbal communication in robotics and intelligent agents, which enables more flexible interaction with humans. It represents an incremental improvement over existing methods.

The paper tackles the problem of generating natural pointing gestures for embodied conversational agents by combining imitation and reinforcement learning. In evaluations that include a virtual reality referential game, it achieves higher naturalness and accuracy than state-of-the-art supervised models.

One of the main goals of robotics and intelligent agent research is to enable natural communication with humans in physically situated settings. While recent work has focused on verbal modes such as language and speech, non-verbal communication is crucial for flexible interaction. We present a framework for generating pointing gestures in embodied agents by combining imitation and reinforcement learning. Using a small motion capture dataset, our method learns a motor control policy that produces physically valid, naturalistic gestures with high referential accuracy. We evaluate the approach against supervised learning and retrieval baselines in both objective metrics and a virtual reality referential game with human users. Results show that our system achieves higher naturalness and accuracy than state-of-the-art supervised models, highlighting the promise of imitation-RL for communicative gesture generation and its potential application to robots.
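
To make the idea of combining imitation with a task objective concrete, the sketch below mixes a pose-imitation term with a pointing-accuracy term into a single scalar reward. This is a minimal illustration, not the paper's reward: the function names, the exponential kernels, the weights, the scale constants, and the choice of a shoulder-to-fingertip ray as the pointing direction are all assumptions introduced here.

```python
import math


def pointing_error(shoulder, fingertip, target):
    """Angle in radians between the shoulder->fingertip ray and the shoulder->target ray.

    Using the shoulder as the ray origin is an assumption made for this sketch.
    """
    arm = [f - s for f, s in zip(fingertip, shoulder)]
    to_target = [t - s for t, s in zip(target, shoulder)]
    norm = math.hypot(*arm) * math.hypot(*to_target)
    if norm == 0.0:
        raise ValueError("degenerate arm or target vector")
    cosine = sum(a * b for a, b in zip(arm, to_target)) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))


def imitation_reward(pose, reference_pose, scale=2.0):
    """Reward in (0, 1] that decays with squared distance to a mocap reference pose."""
    squared_distance = sum((p - r) ** 2 for p, r in zip(pose, reference_pose))
    return math.exp(-scale * squared_distance)


def pointing_reward(shoulder, fingertip, target, scale=5.0):
    """Reward in (0, 1] that decays with the squared angular pointing error."""
    return math.exp(-scale * pointing_error(shoulder, fingertip, target) ** 2)


def combined_reward(pose, reference_pose, shoulder, fingertip, target,
                    w_imitation=0.5, w_pointing=0.5):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return (w_imitation * imitation_reward(pose, reference_pose)
            + w_pointing * pointing_reward(shoulder, fingertip, target))
```

Under these assumptions, a pose that matches the reference exactly and points straight at the target scores the maximum reward, and either a less natural pose or a larger angular error lowers it. A reinforcement learning algorithm would then optimize a motor control policy against such a signal.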

