CV · Mar 3, 2022

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction

arXiv:2203.01577v4 · 341 citations · h-index: 22
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the need for category-level human-object interaction research in 4D egocentric vision, providing a resource for tasks like semantic segmentation and object pose tracking. The contribution is incremental: it builds on prior datasets by adding 4D and category-level aspects.

The authors introduce HOI4D, a large-scale dataset with 2.4M RGB-D frames and rich annotations, to study human-object interaction in 4D. They use it to establish three benchmarking tasks, which expose challenges for existing methods.

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction. HOI4D consists of 2.4M RGB-D egocentric video frames over 4000 sequences collected by 4 participants interacting with 800 different object instances from 16 categories over 610 different indoor rooms. Frame-wise annotations for panoptic segmentation, motion segmentation, 3D hand pose, category-level object pose and hand action have also been provided, together with reconstructed object meshes and scene point clouds. With HOI4D, we establish three benchmarking tasks to promote category-level HOI from 4D visual signals including semantic segmentation of 4D dynamic point cloud sequences, category-level object pose tracking, and egocentric action segmentation with diverse interaction targets. In-depth analysis shows HOI4D poses great challenges to existing methods and produces great research opportunities.
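As a rough sense of scale, the headline figures in the abstract imply the averages sketched below. This is only a back-of-the-envelope calculation: the counts are the rounded values quoted above ("over 4000" is treated as exactly 4000), the real distribution across sequences and categories may be uneven, and the variable names are illustrative rather than part of any HOI4D tooling.

```python
# Headline figures as quoted in the abstract (rounded, so results are approximate).
frames = 2_400_000      # "2.4M RGB-D egocentric video frames"
sequences = 4_000       # "over 4000 sequences" (assumed to be exactly 4000 here)
instances = 800         # "800 different object instances"
categories = 16         # "16 categories"

# Simple averages; these assume a uniform spread, which the abstract does not claim.
frames_per_sequence = frames / sequences
instances_per_category = instances / categories

print(f"~{frames_per_sequence:.0f} frames per sequence on average")
print(f"~{instances_per_category:.0f} object instances per category on average")
```

Under these assumptions a typical sequence is on the order of a few hundred frames, and each category is covered by dozens of distinct instances, which is what makes category-level (rather than instance-level) evaluation feasible.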

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
