LG · AI · RO · Aug 21, 2023

To Whom are You Talking? A Deep Learning Model to Endow Social Robots with Addressee Estimation Skills

arXiv:2308.10757v2 · 6 citations · h-index: 48
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of enabling social robots to understand human communication dynamics. It appears incremental, as it builds on existing methods for addressee estimation.

The paper tackles Addressee Estimation for social robots by interpreting the speaker's non-verbal bodily cues. It uses a hybrid deep learning model with convolutional layers and LSTM cells to localize the addressee in space from the robot's ego-centric perspective.

Communicating shapes our social world. For a robot to be considered social, and consequently be integrated into our social environment, it must understand some of the dynamics that rule human-human communication. In this work, we tackle the problem of Addressee Estimation, the ability to understand to whom an utterance is addressed, by interpreting and exploiting non-verbal bodily cues from the speaker. We do so by implementing a hybrid deep learning model composed of convolutional layers and LSTM cells, taking as input images portraying the face of the speaker and 2D vectors of the speaker's body posture. Our implementation choices were guided by the aim of developing a model that could be deployed on social robots and be efficient in ecological scenarios. We demonstrate that our model is able to solve the Addressee Estimation problem in terms of addressee localisation in space, from a robot ego-centric point of view.
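The data flow described in the abstract (convolutional features from face images, fused with 2D pose vectors, fed through LSTM cells, ending in a spatial prediction) can be illustrated with a minimal, untrained sketch in plain Python. This is not the authors' implementation: the class name `AddresseeSketch`, the label set `CLASSES`, the layer sizes, the concatenation-based fusion, and the bias-free gates are all assumptions made for illustration.

```python
import math
import random

random.seed(0)  # weights are random and untrained: this only shows the data flow

CLASSES = ["left", "robot", "right"]  # hypothetical label set, not taken from the source

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv_feature(image, kernel):
    """Valid 2D convolution (cross-correlation), ReLU, then global average pooling."""
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    total = 0.0
    for i in range(out_h):
        for j in range(out_w):
            act = sum(kernel[a][b] * image[i + a][j + b]
                      for a in range(k) for b in range(k))
            total += max(act, 0.0)
    return total / (out_h * out_w)

class AddresseeSketch:
    def __init__(self, n_kernels=4, pose_dim=6, hidden=8):
        self.kernels = [rand_matrix(3, 3) for _ in range(n_kernels)]
        in_dim = n_kernels + pose_dim
        self.hidden = hidden
        # One weight matrix per LSTM gate: input, forget, cell candidate, output.
        # Biases are omitted to keep the sketch short.
        self.gates = {g: rand_matrix(hidden, in_dim + hidden) for g in "ifgo"}
        self.head = rand_matrix(len(CLASSES), hidden)

    def step(self, face, pose, h, c):
        # Fuse the two cues by concatenating face features and pose vector.
        x = [conv_feature(face, k) for k in self.kernels] + list(pose)
        z = x + h
        i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
        f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
        g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
        o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
        c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c

    def predict(self, faces, poses):
        """Run the frame sequence through the LSTM and return class probabilities."""
        h, c = [0.0] * self.hidden, [0.0] * self.hidden
        for face, pose in zip(faces, poses):
            h, c = self.step(face, pose, h, c)
        logits = matvec(self.head, h)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        s = sum(exps)
        return [e / s for e in exps]

# Usage: five frames of synthetic 8x8 "face" images and 6-value pose vectors
# (three hypothetical 2D keypoints per frame).
faces = [[[random.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]
poses = [[random.random() for _ in range(6)] for _ in range(5)]
model = AddresseeSketch()
probs = model.predict(faces, poses)
print(dict(zip(CLASSES, probs)))
```

Because the weights are random, the printed probabilities carry no meaning. The sketch only shows how per-frame visual and postural features could be combined and accumulated over time before a final classification over spatial positions relative to the robot.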

Code Implementations: 1 repo
