CV · Jul 9, 2023

HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding

arXiv:2307.05721v1 · 32 citations · h-index: 37 · Has Code
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the need for better assembly knowledge understanding in industrial settings, though it is incremental as it focuses on creating a new dataset rather than a novel method.

The authors introduced HA-ViD, the first human assembly video dataset designed to understand comprehensive assembly knowledge, featuring 3222 multi-view videos with 1.5M frames and extensive annotations, and benchmarked it on four foundational video understanding tasks to analyze performance in areas like assembly progress and efficiency.

Understanding comprehensive assembly knowledge from videos is critical for a futuristic ultra-intelligent industry. To enable technological breakthroughs, we present HA-ViD, the first human assembly video dataset that features representative industrial assembly scenarios, a natural procedural knowledge acquisition process, and consistent human-robot shared annotations. Specifically, HA-ViD captures diverse collaboration patterns of real-world assembly, natural human behaviors and learning progression during assembly, and fine-grained action annotations that break each action down into subject, action verb, manipulated object, target object, and tool. We provide 3222 multi-view, multi-modality videos (each video contains one assembly task), 1.5M frames, 96K temporal labels and 2M spatial labels. We benchmark four foundational video understanding tasks: action recognition, action segmentation, object detection and multi-object tracking. Importantly, we analyze their performance for comprehending knowledge in assembly progress, process efficiency, task collaboration, skill parameters and human intention. Details of HA-ViD are available at: https://iai-hrc.github.io/ha-vid.
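To make the annotation structure concrete, here is a minimal Python sketch of how one temporal action label with the five elements named in the abstract might be represented, together with rough per-video averages derived from the rounded totals quoted above. The class name, field names, frame-range fields, and the example values are assumptions made for illustration; they do not reflect the dataset's actual file format or label vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionLabel:
    """One temporal action label (hypothetical schema, not HA-ViD's real format)."""
    start_frame: int                      # assumed: first frame of the action
    end_frame: int                        # assumed: last frame, inclusive
    subject: str
    action_verb: str
    manipulated_object: str
    target_object: Optional[str] = None   # not every action needs a target
    tool: Optional[str] = None            # not every action uses a tool

    def num_frames(self) -> int:
        """Length of the labelled segment in frames."""
        return self.end_frame - self.start_frame + 1


# Rounded totals as quoted in the abstract.
NUM_VIDEOS = 3222
NUM_FRAMES = 1_500_000
NUM_TEMPORAL_LABELS = 96_000


def per_video(total: int, videos: int = NUM_VIDEOS) -> float:
    """Average count per video, given a dataset-wide total."""
    return total / videos


# Made-up example values, for illustration only.
example = ActionLabel(
    start_frame=120,
    end_frame=179,
    subject="right hand",
    action_verb="insert",
    manipulated_object="gear",
    target_object="shaft",
)

if __name__ == "__main__":
    print(example.num_frames())
    print(round(per_video(NUM_FRAMES)), "frames per video (approx.)")
    print(round(per_video(NUM_TEMPORAL_LABELS)), "temporal labels per video (approx.)")
```

Because the totals are rounded in the abstract, the averages are only indicative: roughly 466 frames and about 30 temporal labels per video.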

Code Implementations: 1 repo
