CV · Jan 18, 2017

Compression of Deep Neural Networks for Image Instance Retrieval

Vijay Chandrasekhar, Jie Lin, Qianli Liao, Olivier Morère, Antoine Veillard, Lingyu Duan, Tomaso Poggio

arXiv:1701.04923v14.426 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses storage constraints for deploying deep neural networks in mobile and hardware applications, though it is incremental as it applies existing compression methods to a specific task.

The paper tackles the problem of large model sizes in CNN-based global descriptors for image instance retrieval by applying compression techniques like quantization and pruning, achieving models of a few MBs with negligible performance loss.

Image instance retrieval is the problem of retrieving images from a database that contain the same object. Convolutional Neural Network (CNN) based descriptors are becoming the dominant approach for generating global image descriptors for the instance retrieval problem. One major drawback of CNN-based global descriptors is that uncompressed deep neural network models require hundreds of megabytes of storage, making them inconvenient to deploy in mobile applications or in custom hardware. In this work, we study the problem of neural network model compression, focusing on the image instance retrieval task. We study quantization, coding, pruning, and weight sharing techniques for reducing model size for the instance retrieval problem. We provide extensive experimental results on the trade-off between retrieval performance and model size for different types of networks on several data sets, providing the most comprehensive study on this topic. We compress models to the order of a few MBs, two orders of magnitude smaller than the uncompressed models, while achieving negligible loss in retrieval performance.
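To make two of the techniques named in the abstract concrete, the following is a minimal sketch of magnitude pruning followed by uniform scalar quantization on a toy weight vector. It is an illustration only, not the authors' pipeline: the function names, the 30% keep fraction, the 4-bit width, and the Gaussian toy weights are all assumptions chosen for the example, and the size estimate ignores the entropy coding and weight sharing that the paper also studies.

```python
import random


def prune_by_magnitude(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights (hypothetical helper)."""
    n_keep = max(1, int(round(len(weights) * keep_fraction)))
    # Threshold is the magnitude of the n_keep-th largest weight.
    threshold = sorted((abs(w) for w in weights), reverse=True)[n_keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_uniform(weights, bits):
    """Map each weight to one of 2**bits evenly spaced levels (hypothetical helper)."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    # Integer codes are what would be stored; dequantized values are what
    # the network would actually compute with.
    codes = [int(round((w - lo) / step)) for w in weights]
    dequantized = [lo + c * step for c in codes]
    return codes, dequantized, step


def size_in_bytes(n_weights, bits_per_weight):
    """Dense storage cost, ignoring any entropy coding or sparse format."""
    return n_weights * bits_per_weight / 8


# Toy stand-in for one layer's weights (assumed distribution, not real data).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10000)]

pruned = prune_by_magnitude(weights, keep_fraction=0.3)
codes, approx, step = quantize_uniform(pruned, bits=4)

max_error = max(abs(a - b) for a, b in zip(pruned, approx))
original_size = size_in_bytes(len(weights), 32)   # 32-bit floats
compressed_size = size_in_bytes(len(weights), 4)  # 4-bit codes
print(f"size ratio: {original_size / compressed_size:.1f}x, "
      f"max quantization error: {max_error:.5f}")
```

In this toy setting the dense 4-bit codes alone give an 8x reduction over 32-bit floats. The larger reductions reported in the abstract would additionally depend on exploiting the sparsity left by pruning and on coding the quantized values, which this sketch does not attempt.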
