HR-RCNN: Hierarchical Relational Reasoning for Object Detection
This work addresses the need for generalized relational reasoning in object detection and is aimed at computer vision researchers and practitioners. It appears incremental, since it builds on existing relational methods.
The paper addresses limited relational reasoning in object detection by proposing a hierarchical relational reasoning framework (HR-RCNN) built on a novel graph attention module (GAM). The authors report improvements on the COCO dataset for both object detection and instance segmentation.
Incorporating relational reasoning in neural networks for object recognition remains an open problem. Although many attempts have been made at relational reasoning, they generally consider only a single type of relationship: pixel relations through self-attention (e.g., non-local networks), scale relations through feature fusion (e.g., feature pyramid networks), or object relations through graph convolutions (e.g., Reasoning-RCNN). Little attention has been given to more generalized frameworks that can reason across these relationships. In this paper, we propose a hierarchical relational reasoning framework (HR-RCNN) for object detection, which utilizes a novel graph attention module (GAM). The GAM is a concise module that enables reasoning across heterogeneous nodes by operating directly on the graph edges. By leveraging heterogeneous relationships, our HR-RCNN shows great improvement on the COCO dataset for both object detection and instance segmentation.
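To make the idea of attention over the edges of a heterogeneous graph more concrete, the following is a minimal sketch and not the authors' GAM, whose exact formulation is not given in this abstract. It assumes that each node type (pixel, scale level, object) has its own projection into a shared space, that each edge is scored with a scaled dot product, and that scores are normalized per destination node. The function names `project` and `edge_attention`, the node names, and all weights are hypothetical.

```python
import math


def project(vec, weight):
    """Multiply a feature vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def edge_attention(nodes, edges, type_weights):
    """One round of attention-weighted message passing over explicit edges.

    nodes: {node_id: (node_type, feature_vector)}
    edges: list of (src_id, dst_id) pairs; messages flow from src to dst
    type_weights: {node_type: projection matrix into a shared space}
    """
    # Assumption: a per-type projection makes heterogeneous nodes comparable.
    h = {nid: project(feat, type_weights[ntype])
         for nid, (ntype, feat) in nodes.items()}
    dim = len(next(iter(h.values())))

    # Score every edge individually with a scaled dot product.
    incoming = {}
    for src, dst in edges:
        score = sum(a * b for a, b in zip(h[src], h[dst])) / math.sqrt(dim)
        incoming.setdefault(dst, []).append((src, score))

    # Softmax over each node's incoming edges, then a residual update.
    out = {nid: list(vec) for nid, vec in h.items()}
    for dst, scored in incoming.items():
        max_score = max(s for _, s in scored)
        exps = [(src, math.exp(s - max_score)) for src, s in scored]
        total = sum(e for _, e in exps)
        for src, e in exps:
            weight = e / total
            for k in range(dim):
                out[dst][k] += weight * h[src][k]
    return out


# Toy graph: one pixel node, one scale node, and one object node,
# each with a different input dimensionality (illustrative values only).
nodes = {
    "pixel_0": ("pixel", [1.0, 0.0, 0.0]),
    "level_0": ("scale", [0.0, 1.0]),
    "box_0": ("object", [1.0, 1.0, 1.0, 1.0]),
}
type_weights = {
    "pixel": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "scale": [[1.0, 0.0], [0.0, 1.0]],
    "object": [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
}
edges = [("pixel_0", "box_0"), ("level_0", "box_0")]
updated = edge_attention(nodes, edges, type_weights)
```

In this toy graph the object node receives messages from both the pixel node and the scale node, so a single update mixes two relationship types that the methods cited above treat separately. A learned module would replace the fixed projection matrices with trained parameters.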