Improving Visual Perception of a Social Robot for Controlled and In-the-wild Human-robot Interaction
This work addresses the trade-off between computational efficiency and interaction quality for social robots, but its contribution is incremental: it applies existing models to a specific robot task.
The study asks whether deep-learning visual perception models improve social robot interaction. It compares them with shallow-learning models in both controlled and in-the-wild settings and finds that they enhance both objective performance and subjective user experience.
Social robots often rely on visual perception to understand their users and the environment. Recent advances in data-driven approaches to computer vision have demonstrated great potential for applying deep-learning models to enhance a social robot's visual perception. However, the high computational demands of deep-learning methods, as opposed to the more resource-efficient shallow-learning models, raise important questions about their effects on real-world interaction and user experience. It is unclear how objective interaction performance and subjective user experience are affected when a social robot adopts a deep-learning-based visual perception model. We employed state-of-the-art human perception and tracking models to improve the visual perception function of the Pepper robot, and conducted a controlled lab study and an in-the-wild human-robot interaction study to evaluate this novel perception function on the task of following a specific user while other people are present in the scene.