CV · Oct 19, 2021

Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction

Takuma Yagi, Md Tasnimul Hasan, Yoichi Sato

arXiv:2110.10174v13.77 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the high cost of annotating hand-object contact in videos with a semi-supervised approach, an incremental advance that improves data efficiency.

The paper tackles the problem of predicting contact states between hands and objects in videos, which is understudied in hand-object interaction analysis, and demonstrates superior performance on a new benchmark dataset.

Every hand-object interaction begins with contact. Although predicting the contact state between hands and objects is useful for understanding hand-object interactions, prior methods for hand-object analysis have assumed that the interacting hands and objects are known, and the contact state itself has not been studied in detail. In this study, we introduce a video-based method for predicting contact between a hand and an object. Specifically, given a video and a pair of hand and object tracks, we predict a binary contact state (contact or no-contact) for each frame. However, annotating a large number of hand-object tracks and contact labels is costly. To overcome this difficulty, we propose a semi-supervised framework consisting of (i) automatic collection of training data with motion-based pseudo-labels and (ii) guided progressive label correction (gPLC), which corrects noisy pseudo-labels using a small amount of trusted data. We validated our framework's effectiveness on a newly built benchmark dataset for hand-object contact prediction and showed superior performance over existing baseline methods. Code and data are available at https://github.com/takumayagi/hand_object_contact_prediction.
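
The idea of correcting noisy per-frame pseudo-labels can be illustrated with a minimal sketch. This is not the authors' gPLC implementation: it assumes a generic confidence-threshold rule, where a pseudo-label is flipped only when a model disagrees with it confidently and trusted labels are never changed. The function name `correct_labels`, its parameters, and the threshold value are all hypothetical.

```python
from typing import List, Optional


def correct_labels(
    pseudo: List[int],
    probs: List[float],
    trusted: List[Optional[int]],
    threshold: float = 0.9,
) -> List[int]:
    """Return per-frame contact labels (1 = contact, 0 = no-contact).

    pseudo:    noisy pseudo-labels, one per frame
    probs:     a model's predicted probability of contact, one per frame
    trusted:   a manually verified label for the frame, or None if unavailable
    threshold: confidence a prediction needs to override a pseudo-label
               (assumed value, not taken from the paper)
    """
    if not (len(pseudo) == len(probs) == len(trusted)):
        raise ValueError("all inputs must have one entry per frame")

    corrected = []
    for label, p, gt in zip(pseudo, probs, trusted):
        if gt is not None:
            # Trusted labels always take precedence.
            corrected.append(gt)
            continue
        predicted = 1 if p >= 0.5 else 0
        confidence = p if predicted == 1 else 1.0 - p
        if predicted != label and confidence >= threshold:
            # The model confidently disagrees: flip the pseudo-label.
            corrected.append(predicted)
        else:
            # Otherwise keep the pseudo-label unchanged.
            corrected.append(label)
    return corrected


pseudo = [1, 1, 0, 0]
probs = [0.95, 0.05, 0.60, 0.97]
trusted = [None, None, None, 0]
print(correct_labels(pseudo, probs, trusted))
```

In a progressive scheme, a rule like this would typically be reapplied over several training rounds, often with the threshold relaxed gradually; how gPLC schedules and guides these corrections is described in the paper and the linked repository.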

