RO · CV · Feb 28, 2024

A Multimodal Handover Failure Detection Dataset and Baselines

arXiv:2402.18319v2 · 7 citations · h-index: 17 · ICRA
AI Analysis

This work addresses a gap in human-robot interaction research by providing a dataset and baselines for detecting human-induced handover failures. The contribution is incremental, since it builds on existing failure detection methods.

The paper tackles the problem of detecting handover failures caused by human participants, which are not addressed in existing datasets, by introducing a multimodal dataset and two baseline methods. The results indicate that video is crucial, but incorporating force-torque data and gripper position improves failure detection and action segmentation accuracy.

An object handover between a robot and a human is a coordinated action which is prone to failure for reasons such as miscommunication, incorrect actions and unexpected object properties. Existing works on handover failure detection and prevention focus on preventing failures due to object slip or external disturbances. However, there is a lack of datasets and evaluation methods that consider unpreventable failures caused by the human participant. To address this deficit, we present the multimodal Handover Failure Detection dataset, which consists of failures induced by the human participant, such as ignoring the robot or not releasing the object. We also present two baseline methods for handover failure detection: (i) a video classification method using 3D CNNs and (ii) a temporal action segmentation approach which jointly classifies the human action, robot action and overall outcome of the action. The results show that video is an important modality, but using force-torque data and gripper position helps improve failure detection and action segmentation accuracy.
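
The second baseline predicts a human action, a robot action and an overall outcome for each time step. The sketch below is not the authors' implementation; it only illustrates, under stated assumptions, how such frame-wise label streams might be collapsed into segments and scored per stream. The label names, the helper functions `frames_to_segments` and `framewise_accuracy`, and the example sequences are all hypothetical.

```python
from itertools import groupby


def frames_to_segments(labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


def framewise_accuracy(predicted, target):
    """Fraction of frames whose predicted label equals the target label."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    if not target:
        return 0.0
    return sum(p == t for p, t in zip(predicted, target)) / len(target)


# Hypothetical label names and sequences, invented purely for illustration.
predicted = {
    "human": ["idle", "idle", "reach", "reach", "reach", "idle"],
    "robot": ["approach", "approach", "wait", "wait", "retract", "retract"],
    "outcome": ["failure"] * 6,
}
target = {
    "human": ["idle", "reach", "reach", "reach", "reach", "idle"],
    "robot": ["approach", "approach", "wait", "wait", "retract", "retract"],
    "outcome": ["failure"] * 6,
}

# One accuracy per output stream (human action, robot action, outcome).
scores = {name: framewise_accuracy(predicted[name], target[name]) for name in target}
human_segments = frames_to_segments(predicted["human"])
```

Scoring each stream separately makes it visible when, for example, the outcome is classified correctly even though the human action boundaries are slightly off.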

Code Implementations: 1 repo