Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning
This work addresses the high computational and memory demands of deploying CNNs in hardware, though the contribution is incremental because it builds on existing pruning methods.
The paper tackles the challenge of deploying deep convolutional neural networks (CNNs) in hardware by proposing an Intra-Kernel Regular (IKR) pruning scheme, achieving up to 10x parameter reduction and 7x computational reduction with less than 1% accuracy degradation.
The recent trend toward increasingly deep convolutional neural networks (CNNs) leads to a higher demand for computational power and memory storage. Consequently, the deployment of CNNs in hardware has become more challenging. In this paper, we propose an Intra-Kernel Regular (IKR) pruning scheme to reduce the size and computational complexity of CNNs by removing redundant weights at a fine-grained level. Unlike other pruning methods such as Fine-Grained pruning, IKR pruning maintains regular kernel structures that are exploitable in a hardware accelerator. Experimental results demonstrate up to 10x parameter reduction and 7x computational reduction at a cost of less than 1% degradation in accuracy versus the un-pruned case.
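To make the contrast between regular and irregular sparsity concrete, the following is a minimal sketch, not the paper's algorithm. The abstract does not say how IKR chooses which weights to remove, so the selection rule here (keep the positions with the largest summed magnitude across a group of kernels) and the helper names `shared_mask` and `apply_mask` are assumptions made only for illustration. The point it shows is that every kernel in the group ends up with the same non-zero layout, which is the kind of regularity a hardware accelerator could exploit.

```python
def shared_mask(kernels, keep):
    """Build one 0/1 mask shared by all kernels in the group.

    Assumed selection rule (hypothetical, not taken from the paper):
    keep the `keep` positions whose summed |weight| across kernels is largest.
    """
    k = len(kernels[0])
    scores = [
        (sum(abs(kern[r][c]) for kern in kernels), r, c)
        for r in range(k)
        for c in range(k)
    ]
    # Sort by descending score; break ties by position so the result is deterministic.
    scores.sort(key=lambda t: (-t[0], t[1], t[2]))
    kept = {(r, c) for _, r, c in scores[:keep]}
    return [[1 if (r, c) in kept else 0 for c in range(k)] for r in range(k)]


def apply_mask(kernel, mask):
    """Zero out every weight whose mask entry is 0."""
    return [[w * m for w, m in zip(w_row, m_row)] for w_row, m_row in zip(kernel, mask)]


# Two toy 3x3 kernels (made-up values for illustration only).
kernels = [
    [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 0.7]],
    [[-0.6, 0.0, 0.1], [0.1, 0.9, 0.0], [0.2, 0.0, -0.5]],
]

mask = shared_mask(kernels, keep=3)
pruned = [apply_mask(kern, mask) for kern in kernels]
```

With these toy values the shared mask keeps the three diagonal positions, so both pruned kernels have non-zero weights in identical locations. A fine-grained scheme that thresholds each weight independently would generally leave a different layout in each kernel, which would then have to be stored and decoded per kernel.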