Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition
This work addresses the deployment challenge for edge devices by providing a plug-and-play compression module, though it is incremental as it builds on existing decomposition techniques.
The paper tackles the problem of deploying large Convolutional Neural Networks on resource-constrained edge devices. It compresses convolutional layers using Generalized Kronecker Product Decomposition (GKPD), which reduces memory usage and floating-point operations, and it reports better results than state-of-the-art decomposition methods such as Tensor-Train and Tensor-Ring on the CIFAR-10 and ImageNet datasets.
Modern Convolutional Neural Network (CNN) architectures, despite their superiority in solving various problems, are generally too large to be deployed on resource-constrained edge devices. In this paper, we reduce memory usage and floating-point operations required by convolutional layers in CNNs. We compress these layers by generalizing the Kronecker Product Decomposition to apply to multidimensional tensors, leading to the Generalized Kronecker Product Decomposition (GKPD). Our approach yields a plug-and-play module that can be used as a drop-in replacement for any convolutional layer. Experimental results for image classification on CIFAR-10 and ImageNet datasets using ResNet, MobileNetv2 and SeNet architectures substantiate the effectiveness of our proposed approach. We find that GKPD outperforms state-of-the-art decomposition methods including Tensor-Train and Tensor-Ring as well as other relevant compression methods such as pruning and knowledge distillation.
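To make the idea of a Kronecker decomposition of a multidimensional tensor concrete, here is a minimal NumPy sketch. It rests on several assumptions not stated in the abstract: the tensor Kronecker product is taken to be the index-wise analogue of the matrix case, the factors are fitted with the classical rearrangement-plus-SVD approach to the nearest Kronecker product problem, and the names `kron_nd`, `gkpd_fit` and `gkpd_reconstruct`, as well as the factor shapes, are hypothetical. It illustrates the parameter savings only and is not the paper's training procedure or its convolution module.

```python
import numpy as np

def kron_nd(a, b):
    """Kronecker product of two tensors with the same number of axes.

    Assumed definition: out[i1*B1 + i2, j1*B2 + j2, ...] = a[i1, j1, ...] * b[i2, j2, ...].
    For 2-D inputs this coincides with np.kron.
    """
    assert a.ndim == b.ndim
    n = a.ndim
    outer = np.multiply.outer(a, b)                     # axes: a_0..a_{n-1}, b_0..b_{n-1}
    perm = [ax for i in range(n) for ax in (i, n + i)]  # interleave: a_0, b_0, a_1, b_1, ...
    shape = [a.shape[i] * b.shape[i] for i in range(n)]
    return outer.transpose(perm).reshape(shape)

def gkpd_fit(w, a_shape, b_shape, rank):
    """Fit w ~ sum_r kron_nd(A[r], B[r]) by rearranging w into a matrix and truncating its SVD."""
    n = w.ndim
    split = [d for i in range(n) for d in (a_shape[i], b_shape[i])]
    perm = list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2))  # all A axes first, then all B axes
    m = w.reshape(split).transpose(perm).reshape(int(np.prod(a_shape)), int(np.prod(b_shape)))
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    a = (u[:, :rank] * s[:rank]).T.reshape((rank, *a_shape))
    b = vt[:rank].reshape((rank, *b_shape))
    return a, b

def gkpd_reconstruct(a, b):
    """Sum the Kronecker products of matching factor pairs."""
    return sum(kron_nd(a_r, b_r) for a_r, b_r in zip(a, b))

# Illustrative weight tensor laid out as (out_channels, in_channels, kernel_h, kernel_w).
rng = np.random.default_rng(0)
a_shape, b_shape, rank = (4, 2, 3, 1), (2, 2, 1, 3), 2
w = gkpd_reconstruct(rng.standard_normal((rank, *a_shape)),
                     rng.standard_normal((rank, *b_shape)))   # shape (8, 4, 3, 3)

a_hat, b_hat = gkpd_fit(w, a_shape, b_shape, rank)
w_hat = gkpd_reconstruct(a_hat, b_hat)
max_error = np.abs(w - w_hat).max()          # near machine precision for this synthetic w
dense_params = w.size                        # 8 * 4 * 3 * 3 = 288
factor_params = a_hat.size + b_hat.size      # 2 * (24 + 12) = 72
```

In this toy case the weight tensor is built to have exactly two Kronecker terms, so the fit is essentially exact while storing 72 numbers instead of 288. A trained layer would generally not have such structure, and the approximation error would then depend on the chosen rank and factor shapes.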