CV · Jan 23, 2017

DSSD : Deconvolutional Single Shot Detector

arXiv:1701.06659v1 · 2054 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better context integration in object detection for computer vision applications, representing an incremental improvement over existing methods.

The paper tackles the problem of improving object detection accuracy, especially for small objects, by introducing additional context through deconvolution layers in a single-shot detector framework, achieving 81.5% mAP on VOC2007 and 33.2% mAP on COCO.

The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection. To achieve this, we first combine a state-of-the-art classifier (Residual-101 [14]) with a fast detection framework (SSD [18]). We then augment SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accuracy, especially for small objects, calling our resulting system DSSD, for deconvolutional single shot detector. While these two contributions are easily described at a high level, a naive implementation does not succeed. Instead, we show that carefully adding additional stages of learned transformations, specifically a module for feed-forward connections in deconvolution and a new output module, enables this new approach and forms a potential way forward for further detection research. Results are shown on both PASCAL VOC and COCO detection. Our DSSD with $513 \times 513$ input achieves 81.5% mAP on VOC2007 test, 80.0% mAP on VOC2012 test, and 33.2% mAP on COCO, outperforming a state-of-the-art method, R-FCN [3], on each dataset.
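The top-down idea in the abstract (upsample a deep, coarse feature map and merge it with a shallower, finer one) can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. The names `upsample2x`, `fuse`, and `top_down` are hypothetical, nearest-neighbour upsampling stands in for a learned deconvolution layer, the element-wise product used for merging is an assumption for illustration, and the exact 2x size ratio between levels is a simplification.

```python
from typing import List

Grid = List[List[float]]  # a single-channel feature map as nested lists


def upsample2x(feature: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling.

    Assumption: a fixed stand-in for a learned deconvolution layer.
    """
    out: Grid = []
    for row in feature:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def fuse(deep: Grid, shallow: Grid) -> Grid:
    """Merge an upsampled deep map with a shallower map of twice the size.

    Assumption: element-wise product is used as the merge operation.
    """
    up = upsample2x(deep)
    same_shape = len(up) == len(shallow) and all(
        len(u) == len(s) for u, s in zip(up, shallow)
    )
    if not same_shape:
        raise ValueError("shallow map must be exactly twice the size of deep map")
    return [[u * s for u, s in zip(ur, sr)] for ur, sr in zip(up, shallow)]


def top_down(pyramid: List[Grid]) -> List[Grid]:
    """Run the top-down pass over a pyramid ordered shallow (large) to deep (small).

    Returns the fused maps in the same order; the deepest map is passed through.
    """
    fused = [pyramid[-1]]
    for shallow in reversed(pyramid[:-1]):
        fused.append(fuse(fused[-1], shallow))
    return fused[::-1]


if __name__ == "__main__":
    deep = [[2.0]]
    mid = [[1.0, 2.0], [3.0, 4.0]]
    shallow = [[1.0] * 4 for _ in range(4)]
    for level in top_down([shallow, mid, deep]):
        print(level)
```

In this sketch each fused level carries information from every deeper level, which is the sense in which the shallow, high-resolution maps used for small objects gain large-scale context.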

Code Implementations: 3 repos
