GlovEgo-HOI: Bridging the Synthetic-to-Real Gap for Industrial Egocentric Human-Object Interaction Detection
This work addresses industrial safety monitoring for workers by providing a dataset and a method that improve egocentric interaction detection, though the contribution is incremental in that it combines existing techniques.
The paper tackles the scarcity of annotated data for industrial egocentric human-object interaction detection by introducing a data generation framework that combines synthetic data with diffusion-based augmentation of real images with realistic Personal Protective Equipment (PPE). It also presents a new benchmark dataset, GlovEgo-HOI, and a model, GlovEgo-Net. The experiments support the effectiveness of both the data generation framework and the model, and the dataset, augmentation pipeline, and pre-trained models are released to foster research.
Egocentric Human-Object Interaction (EHOI) analysis is crucial for industrial safety, yet the development of robust models is hindered by the scarcity of annotated domain-specific data. We address this challenge by introducing a data generation framework that combines synthetic data with a diffusion-based process to augment real-world images with realistic Personal Protective Equipment (PPE). We present GlovEgo-HOI, a new benchmark dataset for industrial EHOI, and GlovEgo-Net, a model integrating Glove-Head and Keypoint-Head modules to leverage hand pose information for enhanced interaction detection. Extensive experiments demonstrate the effectiveness of the proposed data generation framework and GlovEgo-Net. To foster further research, we release the GlovEgo-HOI dataset, augmentation pipeline, and pre-trained models at: GitHub project.
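To make the idea of combining synthetic data with diffusion-augmented real images concrete, the following is a minimal sketch of one way such a training set could be assembled. The abstract does not specify how the two sources are mixed, so the fixed-ratio sampling shown here is an assumption, and `Sample`, `mix_datasets`, and `synthetic_fraction` are hypothetical names that do not come from the released code.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One training image and where it came from (hypothetical structure)."""
    image_path: str
    source: str  # "synthetic" or "augmented_real"


def mix_datasets(synthetic, augmented_real, synthetic_fraction, total, seed=0):
    """Draw `total` samples, a fixed share of which are synthetic.

    Assumption: a simple fixed-ratio mix without replacement. The actual
    GlovEgo-HOI construction may differ.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")

    n_synthetic = round(total * synthetic_fraction)
    n_real = total - n_synthetic
    if n_synthetic > len(synthetic) or n_real > len(augmented_real):
        raise ValueError("not enough samples to satisfy the requested mix")

    rng = random.Random(seed)  # seeded for reproducible splits
    mixed = rng.sample(synthetic, n_synthetic) + rng.sample(augmented_real, n_real)
    rng.shuffle(mixed)  # interleave the two sources
    return mixed


if __name__ == "__main__":
    # Placeholder file names, used only for illustration.
    synthetic = [Sample(f"syn_{i}.png", "synthetic") for i in range(10)]
    real = [Sample(f"real_{i}.png", "augmented_real") for i in range(10)]
    batch = mix_datasets(synthetic, real, synthetic_fraction=0.3, total=10)
    print(sum(s.source == "synthetic" for s in batch), "synthetic of", len(batch))
```

Varying `synthetic_fraction` in a sketch like this is one simple way to study how much each data source contributes to downstream interaction detection.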