CV · Nov 15, 2018

LinkNet: Relational Embedding for Scene Graph

Sanghyun Woo, Dahun Kim, Donghyeon Cho, In So Kweon

arXiv:1811.06410v125.4157 citations · h-index: 62

Originality: Incremental advance

AI Analysis

This paper addresses structured image understanding for computer vision applications and represents an incremental improvement over prior methods.

The paper tackles scene graph generation from images by modeling inter-dependency among all object instances, achieving state-of-the-art results on the Visual Genome benchmark.

Objects and their relationships are critical contents for image understanding. A scene graph provides a structured description that captures these properties of an image. However, reasoning about the relationships between objects is very challenging, and only a few recent works have attempted to solve the problem of generating a scene graph from an image. In this paper, we present a method that improves scene graph generation by explicitly modeling inter-dependency among all object instances. We design a simple and effective relational embedding module that enables our model to jointly represent connections among all related objects, rather than focusing on an object in isolation. Our method significantly benefits the main part of the scene graph generation task: relationship classification. Using it on top of a basic Faster R-CNN, our model achieves state-of-the-art results on the Visual Genome benchmark. We further push the performance by introducing a global context encoding module and a geometrical layout encoding module. We validate our final model, LinkNet, through extensive ablation studies, demonstrating its efficacy in scene graph generation.
