CV · Dec 5, 2017

R-FCN-3000 at 30fps: Decoupling Detection and Classification

arXiv:1712.01802v1 · 98 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and accurate object detection in computer vision applications, representing an incremental improvement over existing methods.

The paper tackles large-scale real-time object detection by decoupling objectness detection from classification. It achieves an mAP of 34.9% on the ImageNet detection dataset while processing 30 images per second, and outperforms YOLO-9000 by 18%.

We present R-FCN-3000, a large-scale real-time object detector in which objectness detection and classification are decoupled. To obtain the detection score for an RoI, we multiply the objectness score with the fine-grained classification score. Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization. For fine-grained classification, these position-sensitive filters are not needed. R-FCN-3000 obtains an mAP of 34.9% on the ImageNet detection dataset and outperforms YOLO-9000 by 18% while processing 30 images per second. We also show that the objectness learned by R-FCN-3000 generalizes to novel classes and the performance increases with the number of training object classes - supporting the hypothesis that it is possible to learn a universal objectness detector. Code will be made available.
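The scoring rule described in the abstract, multiplying an RoI's objectness score by its fine-grained classification score, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a sigmoid for objectness, the softmax over class logits, and the toy input values are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    # Assumption: objectness is a single logit squashed to (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    # Numerically stable softmax over the fine-grained class logits.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detection_scores(objectness_logit, class_logits):
    # Hypothetical helper: final per-class score for one RoI is
    # objectness multiplied by the fine-grained class probability.
    objectness = sigmoid(objectness_logit)
    return [objectness * p for p in softmax(class_logits)]


# Toy example: one RoI and three classes (a real detector would have thousands).
roi_scores = detection_scores(2.0, [1.0, 3.0, 0.5])
best_class = max(range(len(roi_scores)), key=roi_scores.__getitem__)
```

Because the class probabilities sum to one, the scores for an RoI sum to its objectness, so a low-objectness region cannot receive a high detection score for any class no matter how confident the classifier is.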

Code Implementations: 2 repos
