CVApr 2, 2024

Disentangled Pre-training for Human-Object Interaction Detection

arXiv:2404.01725v118 citationsh-index: 4Has CodeCVPR
Originality Incremental advance
AI Analysis

This work addresses data scarcity in HOI detection, an incremental improvement for computer vision applications.

The paper tackles the problem of limited supervised data for human-object interaction (HOI) detection by proposing a disentangled pre-training method that uses object detection and action recognition datasets to pre-train decoder layers separately, resulting in significant performance enhancements on rare categories.

Detecting human-object interaction (HOI) has long been limited by the amount of supervised data available. Recent approaches address this issue by pre-training according to pseudo-labels, which align object regions with HOI triplets parsed from image captions. However, pseudo-labeling is tricky and noisy, making HOI pre-training a complex process. Therefore, we propose an efficient disentangled pre-training method for HOI detection (DP-HOI) to address this problem. First, DP-HOI utilizes object detection and action recognition datasets to pre-train the detection and interaction decoder layers, respectively. Then, we arrange these decoder layers so that the pre-training architecture is consistent with the downstream HOI detection task. This facilitates efficient knowledge transfer. Specifically, the detection decoder identifies reliable human instances in each action recognition dataset image, generates one corresponding query, and feeds it into the interaction decoder for verb classification. Next, we combine the human instance verb predictions in the same image and impose image-level supervision. The DP-HOI structure can be easily adapted to the HOI detection task, enabling effective model parameter initialization. Therefore, it significantly enhances the performance of existing HOI detection models on a broad range of rare categories. The code and pre-trained weight are available at https://github.com/xingaoli/DP-HOI.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes