Depthwise Multiception Convolution for Reducing Network Parameters without Sacrificing Accuracy
This work offers an incremental improvement for researchers and practitioners looking to deploy deep learning models with reduced memory and storage requirements.
This paper addresses the problem of high parameter counts in deep convolutional neural networks by proposing depthwise multiception convolution (Multiception). Multiception reduces parameters by 32.48% on average while maintaining or improving accuracy across five popular CNN models on four benchmark datasets.
Deep convolutional neural networks have proven successful in multiple benchmark challenges in recent years. However, the performance improvements rely heavily on increasingly complex network architectures and large numbers of parameters, which require ever-increasing amounts of storage and memory. Depthwise separable convolution (DSConv) can effectively reduce the number of required parameters by decoupling standard convolution into a spatial convolution step and a cross-channel convolution step. However, the method degrades accuracy. To address this problem, we present depthwise multiception convolution, termed Multiception, which introduces layer-wise multiscale kernels to learn multiscale representations of all individual input channels simultaneously. We carried out experiments on four benchmark datasets, i.e. CIFAR-10, CIFAR-100, STL-10 and ImageNet32x32, using five popular CNN models. Multiception improved accuracy in all models and achieved higher accuracy than related works. Meanwhile, Multiception reduces the number of parameters of standard convolution-based models by 32.48% on average while still preserving accuracy.
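To make the parameter argument concrete, the following minimal sketch counts the weights of a single layer under three designs: standard convolution, DSConv, and a depthwise layer with several kernel scales followed by a pointwise step. The first two counts follow from the standard definitions. The third is only an illustration under stated assumptions: the kernel sizes (1, 3, 5), the channel widths, and the choice to merge the multiscale outputs by summation before the pointwise step are hypothetical and are not taken from the abstract. Function names are likewise illustrative.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsconv_params(c_in, c_out, k):
    """One k x k spatial filter per input channel, then a 1 x 1 cross-channel conv."""
    return k * k * c_in + c_in * c_out


def multiscale_depthwise_params(c_in, c_out, kernel_sizes):
    """Depthwise filters at several scales, then a 1 x 1 cross-channel conv.

    ASSUMPTION: the per-scale outputs are summed, so the pointwise step still
    sees c_in channels. The abstract does not specify how scales are combined.
    """
    depthwise = sum(k * k * c_in for k in kernel_sizes)
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out = 64, 128
std = standard_conv_params(c_in, c_out, 3)
ds = dsconv_params(c_in, c_out, 3)
ms = multiscale_depthwise_params(c_in, c_out, (1, 3, 5))

print(std, ds, ms)  # 73728 8768 10432
```

Under these assumptions the multiscale layer costs somewhat more than plain DSConv but remains far below the standard convolution. This per-layer ratio is not comparable to the 32.48% figure above, which is an average over whole models.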