CV | Aug 5, 2020

Pose-based Modular Network for Human-Object Interaction Detection

Zhijun Liang, Junfa Liu, Yisheng Guan, Juan Rojas

arXiv:2008.02042v15.015 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses scene understanding for computer vision applications, but it is incremental as it builds on existing networks with modular enhancements.

The paper tackles human-object interaction detection by leveraging human pose and spatial information, and reports significant improvements on the V-COCO and HICO-DET benchmarks when its module is combined with a state-of-the-art model.

Human-object interaction (HOI) detection is a critical task in scene understanding. The goal is to infer the triplet <subject, predicate, object> in a scene. In this work, we note that the human pose itself as well as the relative spatial information of the human pose with respect to the target object can provide informative cues for HOI detection. We contribute a Pose-based Modular Network (PMN) which explores the absolute pose features and relative spatial pose features to improve HOI detection and is fully compatible with existing networks. Our module consists of a branch that first processes the relative spatial pose features of each joint independently. Another branch updates the absolute pose features via fully connected graph structures. The processed pose features are then fed into an action classifier. To evaluate our proposed method, we combine the module with the state-of-the-art model named VS-GATs and obtain significant improvement on two public benchmarks: V-COCO and HICO-DET, which shows its efficacy and flexibility. Code is available at https://github.com/birlrobotics/PMN.
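The two kinds of pose cues named in the abstract can be illustrated with a small sketch. This is not the PMN implementation: the abstract does not give the exact feature definitions, so the normalization by human box size, the offset-to-object-center formulation, and the mean-aggregation graph update below are all assumptions chosen for illustration. The function names are hypothetical, and the learned layers and the action classifier are omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def _box_size(box: Box) -> Tuple[float, float]:
    """Width and height of a box, floored to avoid division by zero."""
    x1, y1, x2, y2 = box
    return max(x2 - x1, 1e-6), max(y2 - y1, 1e-6)


def absolute_pose_features(joints: List[Point], human_box: Box) -> List[Point]:
    """Assumed definition: joint coordinates normalized to the human box."""
    x1, y1, _, _ = human_box
    w, h = _box_size(human_box)
    return [((jx - x1) / w, (jy - y1) / h) for jx, jy in joints]


def relative_spatial_pose_features(
    joints: List[Point], human_box: Box, object_box: Box
) -> List[Point]:
    """Assumed definition: per-joint offset to the object center,
    scaled by the human box size. Each joint is handled independently."""
    w, h = _box_size(human_box)
    ox1, oy1, ox2, oy2 = object_box
    cx, cy = (ox1 + ox2) / 2.0, (oy1 + oy2) / 2.0
    return [((cx - jx) / w, (cy - jy) / h) for jx, jy in joints]


def fully_connected_update(features: List[Point]) -> List[Point]:
    """One message-passing round on a fully connected joint graph:
    each joint adds the mean feature of all other joints (no learned weights)."""
    n = len(features)
    if n < 2:
        return list(features)
    sx = sum(f[0] for f in features)
    sy = sum(f[1] for f in features)
    return [
        (fx + (sx - fx) / (n - 1), fy + (sy - fy) / (n - 1))
        for fx, fy in features
    ]


# Toy example with two joints, one human box, and one object box.
joints = [(10.0, 10.0), (20.0, 30.0)]
human_box = (0.0, 0.0, 40.0, 40.0)
object_box = (30.0, 10.0, 50.0, 30.0)

abs_feats = absolute_pose_features(joints, human_box)
rel_feats = relative_spatial_pose_features(joints, human_box, object_box)
updated = fully_connected_update(abs_feats)
```

In a trained model, the relative features would pass through a per-joint learned layer and the graph update would use learned weights before both outputs reach the action classifier; the sketch only shows how the two inputs differ in what they encode.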
