Compressed Object Detection
This work provides a method to reduce the computational cost of object detection models, making them more deployable on resource-constrained hardware like mobile phones and IoT devices.
This paper addresses the computational expense of deep learning models for object detection by extending pruning and weight sharing techniques. The authors successfully compressed a state-of-the-art object detection model by 30.0% without any performance loss.
Deep learning approaches have achieved unprecedented performance in visual recognition tasks such as object detection and pose estimation. However, state-of-the-art models have millions of parameters represented as floating-point numbers, which makes them computationally expensive and constrains their deployment on hardware such as mobile phones and IoT nodes. Activations of deep neural networks tend to be sparse, which suggests that models are over-parameterized with redundant neurons. Model compression techniques, such as pruning and quantization, have recently shown promising results by reducing model complexity with little loss in performance. In this work, we extend pruning, a compression technique that discards unnecessary model connections, and weight sharing techniques to the task of object detection. With our approach, we are able to compress a state-of-the-art object detection model by 30.0% without a loss in performance. We also show that our compressed model can be easily initialized with existing pre-trained weights, and thus is able to fully utilize published state-of-the-art model zoos.
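To make the two compression ideas named in the abstract concrete, the following is a minimal NumPy sketch of generic magnitude pruning followed by weight sharing through 1-D k-means clustering. It is not the authors' method, which the abstract does not specify. The function names `magnitude_prune` and `share_weights`, the random weight matrix, the 0.3 sparsity value, and the choice of 16 clusters are all illustrative assumptions, and the sparsity value is not a claim about how the reported 30.0% compression was obtained.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    Returns the pruned weights and a boolean mask of surviving connections.
    (Hypothetical helper; generic magnitude pruning, not the paper's procedure.)
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of connections to discard
    mask = np.ones(flat.size, dtype=bool)
    if k > 0:
        # Indices of the k smallest-magnitude entries (order among them is irrelevant).
        drop = np.argpartition(flat, k - 1)[:k]
        mask[drop] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask


def share_weights(weights, mask, n_clusters, n_iter=20):
    """Replace each surviving weight by the centroid of its 1-D k-means cluster.

    After this step the layer stores at most `n_clusters` distinct values
    plus a small index per connection. (Hypothetical helper.)
    """
    values = weights[mask]
    # Linear initialisation of centroids over the range of surviving weights.
    centroids = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iter):
        # Assign every weight to its nearest centroid.
        assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Move each non-empty centroid to the mean of its members.
        for c in range(n_clusters):
            members = values[assign == c]
            if members.size:
                centroids[c] = members.mean()
    assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    shared = np.zeros_like(weights)
    shared[mask] = centroids[assign]
    return shared, centroids


# Illustrative data: a random matrix standing in for one layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

pruned, mask = magnitude_prune(w, sparsity=0.3)
shared, centroids = share_weights(pruned, mask, n_clusters=16)

print("surviving connections:", int(mask.sum()), "of", w.size)
print("distinct shared values:", np.unique(shared[mask]).size)
```

In practice, both steps are usually followed by fine-tuning so that the remaining connections and shared values can compensate for the discarded ones; that training loop is omitted here.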