CV · Aug 4, 2016

UnitBox: An Advanced Object Detection Network

arXiv:1608.01471v1 · 1719 citations
Originality: Incremental advance
AI Analysis

The paper addresses localization accuracy in object detection, particularly face detection. The advance is incremental because it builds on existing deep CNN frameworks.

The paper tackles the problem of inaccurate object localization in deep CNN-based detection by introducing an IoU loss function that regresses bounding box bounds as a correlated unit, resulting in state-of-the-art performance on the FDDB face detection benchmark.

In current object detection systems, deep convolutional neural networks (CNNs) are used to predict the bounding boxes of object candidates, and they have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods treat the object bounds as four independent variables, each regressed separately with the $\ell_2$ loss. This oversimplified assumption contradicts the well-accepted observation that those variables are correlated, and it results in less accurate localization. To address the issue, we first introduce a novel Intersection over Union ($IoU$) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the $IoU$ loss and deep fully convolutional networks, we introduce UnitBox, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.
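
The contrast between the two losses can be made concrete with a minimal sketch. The abstract does not spell out the formulation, so the details here are assumptions: each box is described by four distances (top, bottom, left, right) from one reference pixel to the box edges, and the loss is taken as $-\ln(IoU)$. The function names `iou_loss` and `l2_loss` are illustrative, not from the paper's code.

```python
import math


def iou_loss(pred, target, eps=1e-9):
    """Assumed form: -ln(IoU) for two boxes that share a reference pixel.

    Each box is (top, bottom, left, right): non-negative distances from
    the reference pixel to the four box edges.
    """
    pt, pb, pl, pr = pred
    tt, tb, tl, tr = target

    area_pred = (pt + pb) * (pl + pr)
    area_target = (tt + tb) * (tl + tr)

    # Both boxes contain the reference pixel, so the overlap on each
    # side is the smaller of the two distances.
    inter_h = min(pt, tt) + min(pb, tb)
    inter_w = min(pl, tl) + min(pr, tr)
    inter = inter_h * inter_w

    union = area_pred + area_target - inter
    iou = inter / max(union, eps)
    # Guard against log(0) for degenerate, non-overlapping boxes.
    return -math.log(max(iou, eps))


def l2_loss(pred, target):
    """Sum of squared errors over the four bounds, each treated independently."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# The same 2-pixel error on every side, for a small and a large box.
small_target, small_pred = (10, 10, 10, 10), (12, 12, 12, 12)
large_target, large_pred = (100, 100, 100, 100), (102, 102, 102, 102)

print(l2_loss(small_pred, small_target), l2_loss(large_pred, large_target))
print(iou_loss(small_pred, small_target), iou_loss(large_pred, large_target))
```

In this example the $\ell_2$ loss is 16 for both boxes, while the $IoU$ loss is roughly 0.36 for the small box and roughly 0.04 for the large one. Because the four bounds enter the loss jointly through the overlap ratio, the penalty scales with box size, which is one way to read the abstract's claim of robustness to varied scales.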
