HC · Sep 14, 2020

Understanding Gesture and Speech Multimodal Interactions for Manipulation Tasks in Augmented Reality Using Unconstrained Elicitation

arXiv:2009.06591v3 · 32 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of designing more intuitive multimodal interfaces for AR manipulation tasks, though it is incremental as it builds on existing elicitation methods to provide specific insights.

The study investigated how users naturally combine gesture and speech for object manipulation in augmented reality, finding that gesture strokes align closely with speech onset (within 10 ms) and that variations in hand posture or syntax cause proposal disagreements, leading to aliasing recommendations to improve system capture of natural interactions.

This research establishes a better understanding of the syntax choices in speech interactions and of how speech, gesture, and multimodal gesture and speech interactions are produced by users in unconstrained object manipulation environments using augmented reality. The work presents a multimodal elicitation study conducted with 24 participants. The canonical referents for translation, rotation, and scale were used along with some abstract referents (create, destroy, and select). In this study, time windows for gesture and speech multimodal interactions are developed using the start and stop times of gestures and speech as well as the stroke times for gestures. While gestures commonly precede speech by 81 ms, we find that the stroke of the gesture is commonly within 10 ms of the start of speech, indicating that the information content of a gesture and its co-occurring speech are well aligned to each other. Lastly, the trends across the most common proposals for each modality are examined, showing that the disagreement between proposals is often caused by a variation of hand posture or syntax. This allows us to present aliasing recommendations to increase the percentage of users' natural interactions captured by future multimodal interactive systems.
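The timing quantities described above (gesture onset relative to speech onset, gesture stroke relative to speech onset, and an overall time window) can be illustrated with a minimal Python sketch. The `Interaction` record, its field names, the function names, and the use of the median as the summary statistic are all assumptions made for illustration; the abstract does not specify how the study stored or aggregated its timestamps.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Interaction:
    """One multimodal proposal; all times in milliseconds (hypothetical schema)."""
    gesture_start_ms: float
    gesture_stroke_ms: float
    gesture_stop_ms: float
    speech_start_ms: float
    speech_stop_ms: float


def onset_lead_ms(i: Interaction) -> float:
    """Positive when the gesture begins before the speech begins."""
    return i.speech_start_ms - i.gesture_start_ms


def stroke_offset_ms(i: Interaction) -> float:
    """Signed distance from speech onset to the gesture stroke."""
    return i.gesture_stroke_ms - i.speech_start_ms


def time_window_ms(i: Interaction) -> tuple[float, float]:
    """Earliest start and latest stop across both modalities."""
    start = min(i.gesture_start_ms, i.speech_start_ms)
    stop = max(i.gesture_stop_ms, i.speech_stop_ms)
    return start, stop


def summarize(interactions: list[Interaction]) -> dict[str, float]:
    """Median offsets across interactions (the median is an assumed choice)."""
    return {
        "median_onset_lead_ms": median(onset_lead_ms(i) for i in interactions),
        "median_abs_stroke_offset_ms": median(
            abs(stroke_offset_ms(i)) for i in interactions
        ),
    }
```

A recognizer built on such measurements might treat a gesture and an utterance as one multimodal command when the stroke falls within a small tolerance of speech onset. The tolerance itself would need to be chosen from the study's full results rather than from this sketch.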

