GnetDet: Object Detection Optimized on a 224mW CNN Accelerator Chip at the Speed of 106FPS
This work addresses the need for low-power, high-speed object detection on embedded systems, but its contribution is incremental because the optimization targets a specific chip.
The paper tackles the problem of optimizing object detection for embedded devices by minimizing the host CPU load when the model runs on a CNN accelerator chip. The resulting GnetDet model achieves 106 FPS on a 224mW chip with excellent accuracy.
Object detection is widely used on embedded devices. With the wide availability of CNN (Convolutional Neural Network) accelerator chips, object detection applications are expected to run with low power consumption and high inference speed. In addition, when a CNN accelerator chip works as a co-processor with a host CPU, the CPU load is expected to be as low as possible. In this paper, we optimize the object detection model on the CNN accelerator chip by minimizing the CPU load. The resulting model is called GnetDet. The experimental results show that the GnetDet model running on a 224mW chip achieves a speed of 106FPS with excellent accuracy.
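The two headline figures can be combined into a rough per-frame energy and frame period. The sketch below is only back-of-the-envelope arithmetic, not a measurement from the paper: it assumes the 224mW figure is the accelerator chip's power draw while sustaining 106FPS, it ignores the host CPU's power entirely, and the function names are hypothetical.

```python
def energy_per_frame_mj(power_mw: float, fps: float) -> float:
    """Energy per frame in millijoules: power (mW) divided by frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    # mW / (frames/s) = mW*s per frame = mJ per frame
    return power_mw / fps


def frame_period_ms(fps: float) -> float:
    """Average time between completed frames, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Figures quoted in the abstract; treating them as simultaneous is an assumption.
chip_power_mw = 224.0
throughput_fps = 106.0

print(f"energy per frame: {energy_per_frame_mj(chip_power_mw, throughput_fps):.2f} mJ")
print(f"frame period:     {frame_period_ms(throughput_fps):.2f} ms")
```

Under these assumptions the chip spends roughly 2.11 mJ per frame and completes a frame about every 9.43 ms. The frame period is an average derived from throughput, so it need not equal single-frame latency if inference is pipelined.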