Fast Exploration of Weight Sharing Opportunities for CNN Compression
This work is relevant to developers and researchers who deploy CNNs on resource-constrained embedded devices, because it makes the optimization process more efficient.
This paper addresses the problem of long design space exploration (DSE) times when applying approximation techniques to compress Convolutional Neural Networks (CNNs) for low-power embedded devices. The authors propose an optimized exploration process that significantly reduces DSE time without compromising the quality of the compression.
The computational workload of Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. A large number of approximation techniques address this problem. These methods have hyper-parameters that must be optimized for each CNN using design space exploration (DSE). The goal of this work is to demonstrate that the DSE phase time can easily explode for state-of-the-art CNNs. We thus propose an optimized exploration process that drastically reduces the exploration time without sacrificing the quality of the output.
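
To illustrate why DSE time can explode, the following is a minimal sketch and not the authors' method. It assumes that weight sharing is done by 1-D k-means clustering of each layer's weights, that the only hyper-parameter is the number of shared values per layer, and that a weight reconstruction error stands in for a real accuracy evaluation. The names `share_weights`, `design_space_size`, and `exhaustive_dse` are hypothetical.

```python
import itertools


def share_weights(weights, n_clusters, n_iters=20):
    """Replace each weight by the nearest of n_clusters shared values.

    Assumption: plain 1-D k-means with linearly spaced initial centroids.
    """
    lo, hi = min(weights), max(weights)
    if n_clusters == 1:
        centroids = [(lo + hi) / 2]
    else:
        step = (hi - lo) / (n_clusters - 1)
        centroids = [lo + step * i for i in range(n_clusters)]
    for _ in range(n_iters):
        buckets = [[] for _ in centroids]
        for w in weights:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(w - centroids[i]))
            buckets[nearest].append(w)
        # An empty bucket keeps its previous centroid.
        centroids = [sum(b) / len(b) if b else c
                     for b, c in zip(buckets, centroids)]
    return [min(centroids, key=lambda c: abs(w - c)) for w in weights]


def design_space_size(n_layers, choices_per_layer):
    """Number of configurations when each layer is tuned independently."""
    return choices_per_layer ** n_layers


def mse(a, b):
    """Mean squared error between two equally long weight lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def exhaustive_dse(layers, candidates):
    """Naive DSE: score every per-layer combination of cluster counts.

    Assumption: the score is a reconstruction-error proxy. A real DSE would
    run the compressed CNN on a validation set, which is far more costly.
    Returns (best_error, best_config, number_of_evaluations).
    """
    best_error, best_config, evaluated = None, None, 0
    for config in itertools.product(candidates, repeat=len(layers)):
        error = sum(mse(w, share_weights(w, k))
                    for w, k in zip(layers, config))
        evaluated += 1
        if best_error is None or error < best_error:
            best_error, best_config = error, config
    return best_error, best_config, evaluated
```

Under these assumptions the number of evaluations grows exponentially with the number of layers: `design_space_size(5, 4)` is 1024, and `design_space_size(20, 4)` already exceeds 10^12. When each evaluation requires a full inference pass over a validation set, exhaustive exploration becomes impractical, which is the motivation for an optimized exploration process.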