Graph R-CNN for Scene Graph Generation
This paper addresses the problem of efficiently detecting objects and their relations in images for computer vision applications, and represents a strong incremental advance.
The paper tackles scene graph generation by proposing Graph R-CNN, which combines a Relation Proposal Network (RePN) that handles the quadratic number of candidate relations between objects with an attentional Graph Convolutional Network (aGCN) for contextual modeling, and it reports state-of-the-art performance.
We propose a novel scene graph generation model called Graph R-CNN that is both effective and efficient at detecting objects and their relations in images. Our model contains a Relation Proposal Network (RePN) that efficiently deals with the quadratic number of potential relations between objects in an image. We also propose an attentional Graph Convolutional Network (aGCN) that effectively captures contextual information between objects and relations. Finally, we introduce a new evaluation metric that is more holistic and realistic than existing metrics. We report state-of-the-art performance on scene graph generation as evaluated using both existing and our proposed metrics.
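The two components described above can be illustrated with a minimal sketch. The abstract does not specify how relations are scored or how attention is computed, so everything here is an assumption made for illustration: `relatedness` is a hypothetical score (the sigmoid of a plain dot product between object feature vectors), `propose_relations` keeps the `top_k` highest-scoring ordered pairs out of the n(n-1) candidates, and `attend` performs one softmax-weighted aggregation step over the surviving edges. A real model would likely use learned projections and richer features.

```python
import math
from itertools import permutations


def relatedness(subj_feat, obj_feat):
    """Hypothetical pair score: sigmoid of the dot product of two feature vectors."""
    dot = sum(s * o for s, o in zip(subj_feat, obj_feat))
    return 1.0 / (1.0 + math.exp(-dot))


def propose_relations(features, top_k):
    """Score all n*(n-1) ordered object pairs and keep the top_k best."""
    scored = [
        (relatedness(features[i], features[j]), i, j)
        for i, j in permutations(range(len(features)), 2)
    ]
    # Highest score first; Python's sort is stable, so ties keep enumeration order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(i, j) for _, i, j in scored[:top_k]]


def attend(features, edges):
    """One attention step: node i adds a softmax-weighted sum of its neighbors j."""
    updated = []
    for i, feat in enumerate(features):
        neighbors = [j for (src, j) in edges if src == i]
        if not neighbors:
            # Nodes with no kept relation are passed through unchanged.
            updated.append(list(feat))
            continue
        logits = [
            sum(a * b for a, b in zip(feat, features[j])) for j in neighbors
        ]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(logit - peak) for logit in logits]
        total = sum(weights)
        message = [
            sum(w / total * features[j][d] for w, j in zip(weights, neighbors))
            for d in range(len(feat))
        ]
        updated.append([f + m for f, m in zip(feat, message)])
    return updated


# Toy input: four objects with 2-D features (made-up values).
features = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
edges = propose_relations(features, top_k=2)
updated = attend(features, edges)
```

With four objects there are 12 ordered candidate pairs; keeping only `top_k` of them means the later aggregation step touches a sparse graph rather than every pair, which is the efficiency argument the abstract makes for relation proposals.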