CV
Jun 30, 2018

Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships

arXiv:1807.00119v1
240 citations
Originality: Incremental advance
AI Analysis

This work aims to improve object detection accuracy by adding contextual reasoning to the detector. It builds incrementally on existing frameworks such as Faster R-CNN.

The paper incorporates scene-level context and instance-level relationships into object detection, formulates detection as graph structure inference, and reports performance improvements on the PASCAL VOC and MS COCO datasets.

Context is important for accurate visual recognition. In this work we propose an object detection algorithm that considers not only object visual appearance but also two kinds of context: scene contextual information and object relationships within a single image. Object detection is therefore regarded as both a cognition problem and a reasoning problem when leveraging this structured information. Specifically, this paper formulates object detection as a problem of graph structure inference, where, given an image, the objects are treated as nodes in a graph and the relationships between objects are modeled as edges in that graph. To this end, we present the Structure Inference Network (SIN), a detector that incorporates a graphical model into a typical detection framework (e.g. Faster R-CNN) in order to infer object state. Comprehensive experiments on the PASCAL VOC and MS COCO datasets indicate that scene context and object relationships improve object detection performance and yield more desirable and reasonable outputs.
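
The graph formulation in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function name `infer_node_states`, the precomputed `edge_weights` matrix, and the plain averaging update are all assumptions made for illustration, standing in for whatever learned update SIN actually uses. The sketch only shows the general idea of refining each object's state with one scene-level message and an aggregate of messages from the other objects.

```python
import numpy as np

def infer_node_states(node_feats, scene_feat, edge_weights, steps=2):
    """Toy structure inference over detected objects (hypothetical helper).

    node_feats:   (n, d) array, one appearance feature per object (graph node).
    scene_feat:   (d,) array, a single feature describing the whole image.
    edge_weights: (n, n) non-negative array; entry [i, j] is how strongly
                  object j influences object i (assumed to be given here).
    """
    states = node_feats.astype(float)
    w = edge_weights.astype(float)
    # An object does not send a message to itself.
    np.fill_diagonal(w, 0.0)
    # Normalise incoming weights so each node receives a weighted average.
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    for _ in range(steps):
        edge_msg = w @ states  # (n, d): aggregated messages from other objects
        # Assumption: a simple average replaces a learned update rule.
        states = (states + scene_feat[None, :] + edge_msg) / 3.0
    return states

# Example: three objects with 4-dimensional features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4))
scene = rng.normal(size=4)
weights = np.abs(rng.normal(size=(3, 3)))
refined = infer_node_states(feats, scene, weights)
print(refined.shape)
```

In a real detector the refined states would feed the classification and box-regression heads. Here they are only returned, so the effect of the scene and edge messages can be inspected directly.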

