CV | Jan 18, 2020

NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection

arXiv:2001.06690v1 | 37 citations
Originality: Incremental advance
AI Analysis

This work addresses scale-aware detection for real-time object detection systems, representing an incremental improvement over existing single-shot detectors.

The paper tackles scale variation in single-shot object detectors, which causes small objects to be missed and salient parts of large objects to be detected as separate objects. It proposes a Neighbor Erasing and Transferring (NET) mechanism that reconfigures pyramid features for scale-aware detection. The resulting network, NETNet, reaches 38.5% AP at 27 FPS and 32.0% AP at 55 FPS on MS COCO.

Due to the advantages of real-time detection and improved performance, single-shot detectors have gained great attention recently. To handle complex scale variations, single-shot detectors make scale-aware predictions based on multiple pyramid layers. However, the features in the pyramid are not scale-aware enough, which limits the detection performance. Two common problems in single-shot detectors caused by object scale variations can be observed: (1) small objects are easily missed; (2) the salient part of a large object is sometimes detected as an object. With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features. In NET, a Neighbor Erasing Module (NEM) is designed to erase the salient features of large objects and emphasize the features of small objects in shallow layers. A Neighbor Transferring Module (NTM) is introduced to transfer the erased features and highlight large objects in deep layers. With this mechanism, a single-shot network called NETNet is constructed for scale-aware object detection. In addition, we propose to aggregate nearest neighboring pyramid features to enhance our NET. NETNet achieves 38.5% AP at a speed of 27 FPS and 32.0% AP at a speed of 55 FPS on the MS COCO dataset. As a result, NETNet achieves a better trade-off between real-time speed and detection accuracy.
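
The erase-then-transfer idea in the abstract can be illustrated with a small NumPy sketch. This is one possible reading under stated assumptions, not the authors' implementation: the function names, the sigmoid gate built from the channel mean, nearest-neighbour upsampling, and average-pool downsampling are all hypothetical choices made for illustration. The sketch only shows the data flow between two adjacent pyramid levels: features that the deeper level responds to are suppressed in the shallow level, and the suppressed part is passed down to the deeper level.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 average pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neighbor_erase(shallow, deep):
    """Suppress in `shallow` the regions where `deep` responds strongly.

    Assumption: the gate is a sigmoid of the channel-mean of the upsampled
    deep feature. Returns (erased_shallow, removed_part).
    """
    deep_up = upsample2x(deep)                            # match shallow resolution
    gate = sigmoid(deep_up.mean(axis=0, keepdims=True))   # (1, H, W) spatial gate
    removed = shallow * gate                              # part attributed to large objects
    return shallow - removed, removed

def neighbor_transfer(deep, removed):
    """Add the removed shallow features to `deep` (assumption: plain sum)."""
    return deep + downsample2x(removed)

# Toy pyramid: a shallow 8x8 level and its deeper 4x4 neighbour, 4 channels each.
rng = np.random.default_rng(0)
shallow = rng.random((4, 8, 8))
deep = rng.random((4, 4, 4))

erased, removed = neighbor_erase(shallow, deep)
enhanced = neighbor_transfer(deep, removed)
print(erased.shape, enhanced.shape)  # (4, 8, 8) (4, 4, 4)
```

In this sketch nothing is lost between the two levels: `erased + removed` reconstructs the original shallow feature, so the step only redistributes information across scales. A learned version would presumably replace the fixed gate and pooling with trainable layers.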

