RO HC · Jan 26, 2018

A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction

arXiv:1801.08760v1 · 38 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific disambiguation challenge in human-robot interaction, but it is incremental as it compares existing visualisation methods rather than introducing a new approach.

The paper tackled the problem of disambiguating verbal requests in human-robot interaction when multiple objects match a description, comparing three visualisation methods (mixed reality, augmented reality, and a monitor) in a controlled experiment with a YuMi robot. The results showed significant differences in accuracy and engagement between conditions, with no differences in task time, and participants preferred augmented reality overall.

Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands, or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions -- mixed reality, augmented reality, and a monitor as the baseline -- using objective measures such as time and accuracy, and subjective measures like engagement, immersion, and display interference. Significant differences were found in accuracy and engagement between the conditions, but no differences were found in task time. Despite the higher error rates in the mixed reality condition, participants found that modality more engaging than the other two, but overall they preferred the augmented reality condition over the monitor and mixed reality conditions.
