CV · LG · Jun 22, 2017

Pixels to Graphs by Associative Embedding

arXiv:1706.07365v2 · 240 citations
Originality: Incremental advance
AI Analysis

This paper addresses scene understanding for computer vision applications and represents an incremental improvement in scene graph generation.

The paper tackles the problem of generating scene graphs from images by training a convolutional neural network to produce full graph definitions end-to-end using associative embeddings, achieving state-of-the-art performance on the Visual Genome dataset.

Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
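To make the "piece them together" step concrete, here is a minimal sketch of how associative embeddings could link graph elements. The abstract does not describe the mechanism, so the sketch rests on assumptions: each detected object carries a learned embedding vector, and each detected relationship predicts two reference embeddings (one for its subject, one for its object) that are matched to the nearest object embedding. The function names `nearest_vertex` and `assemble_graph` and all numeric values are hypothetical and chosen only for illustration.

```python
from math import dist

def nearest_vertex(ref, vertex_embeddings):
    """Return the index of the vertex whose embedding is closest to `ref`."""
    return min(range(len(vertex_embeddings)),
               key=lambda i: dist(ref, vertex_embeddings[i]))

def assemble_graph(vertices, edges):
    """Resolve each edge's reference embeddings to concrete vertices.

    vertices: list of (label, embedding) pairs
    edges:    list of (predicate, subject_ref, object_ref) triples, where
              the refs are embeddings predicted alongside the relationship
    """
    embeddings = [emb for _, emb in vertices]
    triples = []
    for predicate, subject_ref, object_ref in edges:
        s = nearest_vertex(subject_ref, embeddings)
        o = nearest_vertex(object_ref, embeddings)
        triples.append((vertices[s][0], predicate, vertices[o][0]))
    return triples

# Toy, made-up values: 1-D embeddings that are well separated per object.
vertices = [("person", (0.0,)), ("horse", (5.0,)), ("hat", (10.0,))]
edges = [("riding", (0.3,), (4.8,)), ("wearing", (-0.2,), (9.6,))]

graph = assemble_graph(vertices, edges)
print(graph)
```

In this reading, training would need to pull each reference embedding toward the embedding of the object it refers to and push embeddings of different objects apart, so that nearest-neighbour matching at inference time is unambiguous.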

Code Implementations: 3 repos