CV · Jul 13, 2020

Location-Aware Box Reasoning for Anchor-Based Single-Shot Object Detection

arXiv:2007.06233v1 · 16 citations
AI Analysis

This work addresses a specific bottleneck in object detection: classification confidence alone is a poor proxy for bounding box quality in anchor-based single-shot detectors. It offers an incremental improvement over existing detectors of that kind.

The paper tackles the problem of bounding box quality in single-shot object detectors by proposing a location-aware anchor-based reasoning (LAAR) method that integrates localization scores with classification confidences for better box selection in non-maximum suppression, resulting in enhanced performance on MS COCO and PASCAL VOC benchmarks.

In most object detection frameworks, the classification confidence of an instance serves as the quality criterion for predicted bounding boxes, as in the confidence-based ranking of non-maximum suppression (NMS). However, the quality of a bounding box, which reflects its spatial relation to the object, is not determined by the classification score alone. Compared with detectors based on a region proposal network (RPN), single-shot object detectors suffer from poorer box quality because they lack a pre-selection of box proposals. In this paper, we target single-shot object detectors and propose location-aware anchor-based reasoning (LAAR) for bounding boxes. LAAR takes both location and classification confidences into account when evaluating bounding box quality. We introduce a novel network block that learns the relative location between the anchors and the ground truths, denoted as a localization score, which acts as a location reference during the inference stage. The localization score is predicted by an independent regression branch and calibrates bounding box quality, so that the best-qualified bounding boxes are selected in NMS. Experiments on the MS COCO and PASCAL VOC benchmarks demonstrate that the proposed location-aware framework improves the performance of current anchor-based single-shot object detection frameworks and yields consistent and robust detection results.
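The idea of ranking boxes in NMS by a location-aware quality rather than by classification confidence alone can be illustrated with a minimal sketch. The abstract does not state how the two scores are combined, so the product `cls_conf * loc_score` used here is an assumption for illustration only. The function names `iou_one_to_many` and `location_aware_nms` are hypothetical, not taken from the paper.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all given as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def location_aware_nms(boxes, cls_conf, loc_score, iou_thresh=0.5):
    """Greedy NMS that ranks boxes by a combined quality score.

    ASSUMPTION: quality is the product of classification confidence and
    localization score; the paper's actual combination rule may differ.
    """
    quality = cls_conf * loc_score
    order = np.argsort(-quality)  # indices sorted by descending quality
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every remaining box that overlaps the kept box too much.
        order = rest[overlaps <= iou_thresh]
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
cls_conf = np.array([0.9, 0.8, 0.7])
loc_score = np.array([0.5, 0.9, 0.8])
print(location_aware_nms(boxes, cls_conf, loc_score))  # [1, 2]
```

In this toy input, boxes 0 and 1 overlap heavily. Ranking by classification confidence alone would keep box 0, whereas the combined score keeps box 1, whose localization score is higher.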

