Adaptive Pixel-wise Structured Sparse Network for Efficient CNNs
This work addresses efficiency bottlenecks in CNNs for vision tasks, offering a hardware-friendly solution with online adaptability, though it is incremental in optimizing existing methods.
The paper tackles the acceleration of deep CNN models by proposing a spatially adaptive framework that dynamically generates pixel-wise sparsity from the input image. The reported savings are a 30%-70% reduction in MACs for image classification with a slight drop in accuracy, and a reduction of more than 90% in MACs for super-resolution with a slight decrease in quality.
To accelerate deep CNN models, this paper proposes a novel spatially adaptive framework that dynamically generates pixel-wise sparsity according to the input image. The sparse scheme is refined at the pixel level and regionally adaptive under a unified importance map, which makes it friendly to hardware implementation. A sparsity control method is further presented to enable online adjustment for applications with different precision and latency requirements. The sparse model is applicable to a wide range of vision tasks. Experimental results show that this method improves computing efficiency for both image classification using ResNet-18 and super-resolution using SRResNet. On the image classification task, our method saves 30%-70% of MACs with a slight drop in top-1 and top-5 accuracy. On the super-resolution task, our method reduces MACs by more than 90% while decreasing PSNR by only about 0.1 dB and SSIM by only about 0.01. Hardware validation is also included.
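To make the idea of pixel-wise sparsity under an importance map concrete, the following is a minimal sketch, not the paper's implementation. It assumes the importance map is already given as a 2D grid of scores (in the paper it is generated from the input image) and uses a simple top-k threshold as a stand-in for the sparsity control method, whose actual form is not described here. The names `pixel_mask`, `dense_macs`, `sparse_macs`, and `keep_ratio` are hypothetical, and the cost of producing the importance map is ignored.

```python
def pixel_mask(importance, keep_ratio):
    """Return a boolean grid that keeps the highest-scoring pixels.

    Assumption: a plain top-k threshold stands in for the paper's
    sparsity control; `keep_ratio` is a hypothetical knob in (0, 1].
    """
    flat = sorted(v for row in importance for v in row)
    k = min(len(flat), max(1, round(keep_ratio * len(flat))))
    threshold = flat[len(flat) - k]  # k-th largest score
    return [[v >= threshold for v in row] for row in importance]


def dense_macs(h, w, c_in, c_out, ksize):
    """MACs of a stride-1 convolution evaluated at every output pixel."""
    return h * w * c_in * c_out * ksize * ksize


def sparse_macs(mask, c_in, c_out, ksize):
    """MACs when the convolution is evaluated only at kept pixels."""
    kept = sum(1 for row in mask for v in row if v)
    return kept * c_in * c_out * ksize * ksize


# Toy 8x8 importance map with distinct scores (illustrative data only).
importance = [[(r * 8 + c) / 63 for c in range(8)] for r in range(8)]

mask = pixel_mask(importance, keep_ratio=0.25)
dense = dense_macs(8, 8, c_in=64, c_out=64, ksize=3)
sparse = sparse_macs(mask, c_in=64, c_out=64, ksize=3)
saving = 1 - sparse / dense
print(f"kept pixels: {sum(map(sum, mask))}, MACs saved: {saving:.0%}")
```

Changing `keep_ratio` at inference time trades accuracy for computation without retraining the mask logic, which is the kind of online adjustment the abstract describes. If several pixels share the threshold score, this sketch keeps all of them, so the kept fraction can slightly exceed `keep_ratio`.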