Modular network for high accuracy object detection
This work addresses accuracy improvements in object detection for computer vision applications, presenting an incremental modular approach.
The paper tackles object detection accuracy by proposing a modular convolutional neural network with a two-stage hierarchical structure, reducing the classification error from 12% to 2.5%-4.5% and reaching a mAP of 0.94.
We present a novel modular convolutional neural network for object detection that significantly improves detection accuracy. The network consists of two stages in a hierarchical structure. The first stage is a network that detects general classes. The second stage consists of separate networks that refine the classification and localization of the objects in each general class. Compared to a state-of-the-art object detection network, the modular network reduces the classification error by a factor of approximately 3 to 5, from 12% to 2.5%-4.5%. The network is easy to implement and achieves a mAP of 0.94. The architecture can serve as a platform for improving the accuracy of widely used state-of-the-art object detection networks and other kinds of deep learning networks. We also show that a deep learning network initialized by transfer learning becomes more accurate as the number of classes it is later trained to detect becomes smaller.
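The two-stage hierarchy described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the names `Detection`, `modular_detect`, `toy_stage1`, and `toy_vehicle_stage2`, as well as the class labels "vehicle", "animal", and "truck", are hypothetical, and the trained networks are replaced by plain callables. The sketch only shows the routing idea, in which each general-class detection from the first stage is passed to the second-stage network dedicated to that class. A small helper also reproduces the error-reduction arithmetic quoted in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


@dataclass
class Detection:
    label: str
    box: Box
    score: float


# Hypothetical interfaces: any callable with these shapes can stand in
# for a trained network.
Stage1 = Callable[[object], List[Detection]]         # image -> general-class detections
Stage2 = Callable[[object, Detection], Detection]    # image, coarse detection -> refined detection


def modular_detect(image: object,
                   stage1: Stage1,
                   stage2_by_class: Dict[str, Stage2]) -> List[Detection]:
    """Run the general-class detector, then refine each detection with the
    second-stage network registered for its general class."""
    results = []
    for coarse in stage1(image):
        refine = stage2_by_class.get(coarse.label)
        # If no refinement network exists for this class, keep the coarse result.
        results.append(refine(image, coarse) if refine else coarse)
    return results


def error_reduction_factor(baseline_error: float, modular_error: float) -> float:
    """Ratio of the baseline classification error to the modular one."""
    return baseline_error / modular_error


# Toy stand-ins for trained networks (assumptions, for illustration only).
def toy_stage1(image: object) -> List[Detection]:
    return [
        Detection("vehicle", (10.0, 20.0, 110.0, 90.0), 0.90),
        Detection("animal", (5.0, 5.0, 50.0, 60.0), 0.80),
    ]


def toy_vehicle_stage2(image: object, coarse: Detection) -> Detection:
    # Pretend to refine both the class label and the box of a "vehicle".
    x1, y1, x2, y2 = coarse.box
    return Detection("truck", (x1 + 2.0, y1 + 1.0, x2 - 2.0, y2 - 1.0), 0.95)


if __name__ == "__main__":
    detections = modular_detect(None, toy_stage1, {"vehicle": toy_vehicle_stage2})
    for d in detections:
        print(d)
    print(error_reduction_factor(12.0, 4.5), error_reduction_factor(12.0, 2.5))
```

With the figures from the abstract, the helper gives 12 / 4.5 ≈ 2.7 and 12 / 2.5 = 4.8, which is the "approximately 3 to 5" range stated above. Keeping the second stage as a per-class dictionary reflects the modular claim: a refinement network can be added or retrained for one general class without touching the others.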