Hardware-aware Pruning of DNNs using LFSR-Generated Pseudo-Random Indices
This work addresses the problem of reducing memory footprint and hardware costs for DNNs in embedded systems, presenting an incremental improvement over existing sparsification techniques.
The paper tackles the hardware overhead of pruning deep neural networks for embedded applications by proposing a hardware-aware pruning method that uses LFSR-generated pseudo-random indices, achieving energy and area savings of up to 63.96% and 64.23%, respectively, for VGG-16 on down-sampled ImageNet at the same compression rate and accuracy.
Deep neural networks (DNNs) have emerged as the state-of-the-art algorithms in a broad range of applications. To reduce the memory footprint of DNNs, in particular for embedded applications, sparsification techniques have been proposed. Unfortunately, these techniques come with a large hardware overhead. In this paper, we present a hardware-aware pruning method in which the locations of non-zero weights are derived in real time from Linear Feedback Shift Registers (LFSRs). Using the proposed method, we demonstrate total savings in energy and area of up to 63.96% and 64.23%, respectively, for the VGG-16 network on down-sampled ImageNet under iso-compression-rate and iso-accuracy conditions.
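To illustrate the idea of deriving non-zero weight locations from an LFSR instead of storing them, the following is a minimal sketch and not the authors' implementation. The register width (16 bits), the tap positions (16, 14, 13, 11, a commonly used maximal-length configuration), the seed, the modulo mapping from register state to weight index, and the helper names `lfsr16`, `lfsr_keep_indices`, and `prune` are all assumptions made for illustration; the abstract does not specify any of them.

```python
def lfsr16(seed):
    """Yield successive states of a 16-bit Fibonacci LFSR.

    Assumed taps: 16, 14, 13, 11 (maximal length, period 65535).
    """
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a non-zero 16-bit value")
    state = seed
    while True:
        # XOR the tapped bits to form the feedback bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def lfsr_keep_indices(num_weights, num_kept, seed=0xACE1):
    """Return num_kept distinct weight indices derived from the LFSR.

    Assumption: a state is mapped to an index by modulo, and repeated
    indices are skipped. Only the seed has to be stored, not the indices.
    """
    if not 0 < num_weights < (1 << 16):
        raise ValueError("num_weights must fit the 16-bit LFSR range")
    if not 0 <= num_kept <= num_weights:
        raise ValueError("num_kept must be between 0 and num_weights")
    kept, seen = [], set()
    states = lfsr16(seed)
    while len(kept) < num_kept:
        idx = next(states) % num_weights
        if idx not in seen:
            seen.add(idx)
            kept.append(idx)
    return kept


def prune(weights, num_kept, seed=0xACE1):
    """Zero every weight whose index is not produced by the LFSR."""
    keep = set(lfsr_keep_indices(len(weights), num_kept, seed))
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


if __name__ == "__main__":
    layer = [0.1 * (i + 1) for i in range(16)]
    print(prune(layer, num_kept=4))
```

Because the sequence is fully determined by the seed, the same indices can be regenerated at inference time, which is the property that would let hardware avoid storing an explicit index array for the sparse weights.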