CV · Mar 3, 2022

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction

Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, Li Yi

arXiv:2203.01577v440.4366 citations · h-index: 79 · Has Code

Originality: Synthesis-oriented

AI Analysis

This dataset addresses the need for category-level human-object interaction research in 4D egocentric vision and provides a resource for tasks such as semantic segmentation and object pose tracking. The contribution is incremental, however: it builds on prior datasets by adding 4D and category-level aspects.

The authors address the problem of understanding human-object interactions in 4D by introducing HOI4D, a large-scale dataset with 2.4M RGB-D frames and rich annotations. They use it to establish three benchmarking tasks, which reveal challenges for existing methods.

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction. HOI4D consists of 2.4M RGB-D egocentric video frames over 4000 sequences collected by 4 participants interacting with 800 different object instances from 16 categories over 610 different indoor rooms. Frame-wise annotations for panoptic segmentation, motion segmentation, 3D hand pose, category-level object pose and hand action have also been provided, together with reconstructed object meshes and scene point clouds. With HOI4D, we establish three benchmarking tasks to promote category-level HOI from 4D visual signals including semantic segmentation of 4D dynamic point cloud sequences, category-level object pose tracking, and egocentric action segmentation with diverse interaction targets. In-depth analysis shows HOI4D poses great challenges to existing methods and produces great research opportunities.
