CV · Sep 23, 2018

Bounding Box Regression with Uncertainty for Accurate Object Detection

arXiv:1809.08545v3 · 539 citations · Has Code
AI Analysis

This paper addresses localization inaccuracies in object detection that stem from ambiguous bounding box labels, and it improves localization accuracy over previous bounding box refinement methods.

The paper tackles the problem of ambiguous bounding box labeling in object detection by proposing a novel regression loss that learns the bounding box transformation and the localization variance together, improving localization accuracy with nearly no additional computation. The learned variance is also used to merge neighboring boxes during non-maximum suppression (NMS). On MS-COCO, it boosts VGG-16 Faster R-CNN AP from 23.6% to 29.1% and improves ResNet-50-FPN Mask R-CNN AP by 1.8% and AP90 by 6.2%, outperforming previous state-of-the-art bounding box refinement methods.

Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization accuracies of various architectures with nearly no additional computation. The learned localization variance allows us to merge neighboring bounding boxes during non-maximum suppression (NMS), which further improves the localization performance. On MS-COCO, we boost the Average Precision (AP) of VGG-16 Faster R-CNN from 23.6% to 29.1%. More importantly, for ResNet-50-FPN Mask R-CNN, our method improves the AP and AP90 by 1.8% and 6.2% respectively, which significantly outperforms previous state-of-the-art bounding box refinement methods. Our code and models are available at: github.com/yihui-he/KL-Loss
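
The abstract does not spell out the loss or the merging rule, so the following is only a minimal sketch of how the two ideas could look for a single box coordinate. It assumes the network predicts a log-variance `alpha = log(sigma^2)` next to each coordinate estimate, that the loss takes a smooth-L1-style form scaled by `exp(-alpha)` with an `alpha / 2` penalty, and that merging is an inverse-variance weighted average. The function names and the optional `overlap_weights` argument are hypothetical and are not taken from the released code.

```python
import math


def kl_regression_loss(x_e, x_g, alpha):
    """Loss for one coordinate (assumed form, see lead-in).

    x_e:   predicted coordinate offset
    x_g:   ground truth coordinate offset
    alpha: predicted log-variance, alpha = log(sigma^2)
    """
    diff = abs(x_g - x_e)
    # Quadratic near zero, linear for large errors (smooth-L1 style).
    fit = 0.5 * diff * diff if diff <= 1.0 else diff - 0.5
    # A large predicted variance down-weights the error term but pays alpha / 2.
    return math.exp(-alpha) * fit + 0.5 * alpha


def variance_weighted_merge(coords, variances, overlap_weights=None):
    """Merge one coordinate of neighboring boxes by inverse-variance weighting.

    overlap_weights is a hypothetical extra factor (for example, one that
    grows with IoU to the selected box); it defaults to 1 for every box.
    """
    if overlap_weights is None:
        overlap_weights = [1.0] * len(coords)
    weights = [p / v for p, v in zip(overlap_weights, variances)]
    return sum(w * x for w, x in zip(weights, coords)) / sum(weights)


if __name__ == "__main__":
    # Same error, different predicted uncertainty.
    print(kl_regression_loss(0.0, 2.0, 0.0))
    print(kl_regression_loss(0.0, 2.0, math.log(4.0)))
    # The more certain box (variance 1) dominates the merged coordinate.
    print(variance_weighted_merge([10.0, 12.0], [1.0, 4.0]))
```

Under these assumptions, a coordinate whose label is ambiguous can be given a higher predicted variance, which lowers its influence both on the loss and on the merged box.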

Code Implementations (4 repos)
