CV · Feb 19, 2017

Zoom Out-and-In Network with Recursive Training for Object Proposal

arXiv:1702.05711v1 · 23 citations
Originality: Incremental advance
AI Analysis

This work addresses object proposal generation for object detection, presenting an incremental improvement over existing methods.

The paper tackles object proposal generation for detection with a zoom-out-and-in network and a recursive training pipeline. It reports better results than prior state-of-the-art methods on the ILSVRC DET and MS COCO datasets, and an increase of around 2% in average precision when the proposals are used in a detection system.

In this paper, we propose a zoom-out-and-in network for generating object proposals. We use feature maps of different resolutions in the network to detect object instances of various sizes. Specifically, we divide the anchor candidates into three clusters based on scale and place them on feature maps of distinct strides to detect small, medium, and large objects, respectively. Deeper feature maps contain region-level semantics that can help their shallower counterparts identify small objects. We therefore design a zoom-in sub-network that increases the resolution of high-level features via a deconvolution operation. The resulting high-resolution, high-level features are then merged with low-level features to detect objects. Furthermore, we devise a recursive training pipeline that consecutively regresses region proposals at the training stage in order to match the iterative regression used at the testing stage. We demonstrate the effectiveness of the proposed method on the ILSVRC DET and MS COCO datasets, where our algorithm outperforms state-of-the-art methods on various evaluation metrics. It also increases average precision by around 2% in the detection system.
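The anchor clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scale boundaries (`SMALL_MAX`, `MEDIUM_MAX`), the strides in `STRIDES`, and the helper names are hypothetical values chosen for illustration, since the abstract gives no concrete numbers. The sketch only shows the idea of splitting anchor scales into three clusters and laying each cluster out on a grid with its own stride.

```python
from typing import Dict, List, Tuple

# Hypothetical values: the abstract does not state the scale boundaries
# or the strides of the feature maps, so these are assumptions.
SMALL_MAX = 64     # anchors up to this side length (pixels) count as "small"
MEDIUM_MAX = 256   # anchors up to this side length count as "medium"
STRIDES = {"small": 8, "medium": 16, "large": 32}


def cluster_for_scale(scale: float) -> str:
    """Return the cluster name for an anchor of the given side length."""
    if scale <= SMALL_MAX:
        return "small"
    if scale <= MEDIUM_MAX:
        return "medium"
    return "large"


def assign_anchors(scales: List[float]) -> Dict[str, List[float]]:
    """Split anchor scales into the three clusters (small, medium, large)."""
    groups: Dict[str, List[float]] = {"small": [], "medium": [], "large": []}
    for scale in scales:
        groups[cluster_for_scale(scale)].append(scale)
    return groups


def anchor_centers(image_size: int, stride: int) -> List[Tuple[float, float]]:
    """Centers (x, y) of the grid cells of a square feature map with this stride."""
    cells = image_size // stride
    return [
        ((col + 0.5) * stride, (row + 0.5) * stride)
        for row in range(cells)
        for col in range(cells)
    ]


if __name__ == "__main__":
    groups = assign_anchors([16, 32, 64, 128, 256, 512])
    for name, scales in groups.items():
        centers = anchor_centers(image_size=256, stride=STRIDES[name])
        print(name, scales, "stride", STRIDES[name], "positions", len(centers))
```

Under these assumed values, small anchors sit on the densest grid and large anchors on the coarsest one, which matches the abstract's motivation: small objects need a higher-resolution feature map to be localized.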

Code Implementations: 1 repo