SINF: Semantic Neural Network Inference with Semantic Subgraphs
This work addresses the high computational and energy cost of DNN inference, particularly on resource-constrained devices, with a pruning approach that is incremental in novelty but shows strong gains in specific settings.
This paper tackles the problem of reducing computational load in deep neural networks (DNNs) by proposing SINF, a method that creates semantic subgraphs based on a Discriminative Capability Score (DCS), resulting in up to 35% reduction in inference time with minimal accuracy loss (e.g., 0.17% for VGG19 on CIFAR100) and improved energy efficiency (e.g., 51% more efficient on Raspberry Pi).
This paper proposes Semantic Inference (SINF), which creates semantic subgraphs in a Deep Neural Network (DNN) based on a new Discriminative Capability Score (DCS) to drastically reduce the DNN computational load with limited performance loss. We evaluate the performance of SINF on VGG16, VGG19, and ResNet50 DNNs trained on CIFAR100 and a subset of the ImageNet dataset. Moreover, we compare its performance against 6 state-of-the-art pruning approaches. Our results show that (i) on average, SINF reduces the inference time of VGG16, VGG19, and ResNet50 by up to 29%, 35%, and 15%, respectively, with only 3.75%, 0.17%, and 6.75% accuracy loss for CIFAR100, while for the ImageNet benchmark the reduction in inference time is 18%, 22%, and 9% for an accuracy drop of 3%, 2.5%, and 6%; (ii) DCS achieves up to 3.65%, 4.25%, and 2.36% better accuracy with VGG16, VGG19, and ResNet50, respectively, than existing discriminative scores for CIFAR100, and the corresponding gains for ImageNet are 8.9%, 5.8%, and 5.2%. Through experimental evaluation on Raspberry Pi and NVIDIA Jetson Nano, we show that SINF is about 51% and 38% more energy efficient and takes about 25% and 17% less inference time than the base model for CIFAR100 and ImageNet.
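To make the idea of a per-cluster subgraph concrete, the following is a minimal sketch, not the method evaluated above. The abstract does not define DCS, so the score used here (mean filter activation on a semantic cluster minus mean activation on all other samples) is a hypothetical stand-in, and the function name `select_filters_per_cluster`, the `keep_ratio` parameter, and the toy data are all assumptions made for illustration. The sketch only shows how a per-filter discriminative score can be turned into one filter mask per cluster of classes.

```python
import numpy as np

def select_filters_per_cluster(activations, cluster_ids, keep_ratio):
    """Pick the most discriminative filters for each semantic cluster.

    activations: array of shape (n_samples, n_filters), e.g. spatially
                 averaged feature maps of one layer.
    cluster_ids: array of shape (n_samples,), the semantic cluster of
                 each sample's class.
    keep_ratio:  fraction of filters kept in each cluster's subgraph.
    Returns a dict mapping cluster id to a boolean mask over filters.
    """
    n_filters = activations.shape[1]
    n_keep = max(1, int(round(keep_ratio * n_filters)))
    masks = {}
    for c in np.unique(cluster_ids):
        inside = activations[cluster_ids == c].mean(axis=0)
        outside = activations[cluster_ids != c].mean(axis=0)
        # Stand-in score (an assumption, NOT the paper's DCS):
        # how much more a filter fires on this cluster than elsewhere.
        score = inside - outside
        top = np.argsort(score)[-n_keep:]  # indices of the highest scores
        mask = np.zeros(n_filters, dtype=bool)
        mask[top] = True
        masks[int(c)] = mask
    return masks

# Toy data: 8 filters, 2 clusters of 100 samples each.
rng = np.random.default_rng(0)
acts = rng.random((200, 8)) * 0.1      # low background activation
cluster_ids = np.repeat([0, 1], 100)
acts[:100, :2] += 1.0                  # filters 0 and 1 respond to cluster 0
acts[100:, 6:] += 1.0                  # filters 6 and 7 respond to cluster 1

masks = select_filters_per_cluster(acts, cluster_ids, keep_ratio=0.25)
for c, m in masks.items():
    print(c, np.flatnonzero(m))
```

At inference time, a subgraph of this kind would evaluate only the filters whose mask entry is true for the cluster an input is routed to, which is where a reduction in computation would come from. How inputs are routed to a cluster is not covered by this sketch.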