CV · AI · NE · Nov 1, 2018

Hybrid Pruning: Thinner Sparse Networks for Fast Inference on Edge Devices

arXiv:1811.00482v1 · 17 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling fast and efficient inference on edge devices like security cameras and drones, representing an incremental improvement in model compression techniques.

The paper tackles the problem of deploying modern neural networks on resource-constrained edge devices. It introduces hybrid pruning, which combines coarse-grained channel pruning with fine-grained weight pruning to reduce model size, computation, and power demands with minimal accuracy loss. The authors report significantly better results for ResNet50 on ImageNet than existing work, even when channel counts are constrained to hardware-friendly numbers.

We introduce hybrid pruning, which combines coarse-grained channel pruning and fine-grained weight pruning to reduce model size, computation, and power demands with little to no loss in accuracy, enabling the deployment of modern networks on resource-constrained devices such as always-on security cameras and drones. Additionally, to perform channel pruning effectively, we propose a fast sensitivity test that quickly identifies how sensitive the output accuracy is to pruning within and across the layers of a network, given a target number of multiply-accumulate operations (MACs) or an accuracy tolerance. Our experiments show significantly better results for ResNet50 on ImageNet compared to existing work, even with the additional constraint that channel counts be hardware-friendly numbers.
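The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the ranking criterion or the rounding rule, so the L1-norm channel ranking, the global magnitude threshold, the rounding of the kept channel count up to a multiple of 8, and all function names here (`prune_channels`, `prune_weights`, `hybrid_prune`) are assumptions made for illustration. The paper's sensitivity test, which would choose the per-layer `keep_ratio`, is not modeled.

```python
import random


def round_up_to_multiple(n, multiple):
    """Round n up to the nearest multiple (a stand-in for a 'hardware-friendly' count)."""
    return ((n + multiple - 1) // multiple) * multiple


def prune_channels(filters, keep_ratio, multiple=8):
    """Coarse-grained step: keep the output channels with the largest L1 norm.

    filters: list of output channels, each a flat list of weights.
    Assumption: L1-norm ranking and rounding to a multiple of 8 are
    illustrative choices, not taken from the paper.
    """
    target = max(1, int(len(filters) * keep_ratio))
    keep = min(len(filters), round_up_to_multiple(target, multiple))
    ranked = sorted(
        range(len(filters)),
        key=lambda i: sum(abs(w) for w in filters[i]),
        reverse=True,
    )
    kept_idx = sorted(ranked[:keep])  # preserve original channel order
    return [list(filters[i]) for i in kept_idx], kept_idx


def prune_weights(filters, sparsity):
    """Fine-grained step: zero the smallest-magnitude weights across the layer."""
    magnitudes = sorted(abs(w) for ch in filters for w in ch)
    n_zero = int(len(magnitudes) * sparsity)
    if n_zero == 0:
        return [list(ch) for ch in filters]
    threshold = magnitudes[n_zero - 1]
    # Weights at or below the threshold are zeroed; ties may zero a few extra.
    return [[0.0 if abs(w) <= threshold else w for w in ch] for ch in filters]


def hybrid_prune(filters, keep_ratio, sparsity, multiple=8):
    """Channel pruning first (thinner layer), then weight pruning (sparser layer)."""
    thin, kept_idx = prune_channels(filters, keep_ratio, multiple)
    return prune_weights(thin, sparsity), kept_idx


# Hypothetical layer: 64 output channels, 27 weights each (3 x 3 x 3 kernels).
random.seed(0)
layer = [[random.gauss(0, 1) for _ in range(27)] for _ in range(64)]
pruned, kept = hybrid_prune(layer, keep_ratio=0.6, sparsity=0.5)
zeros = sum(1 for ch in pruned for w in ch if w == 0.0)
print(len(pruned), "channels kept;", zeros, "of", len(pruned) * 27, "weights zeroed")
```

With `keep_ratio=0.6`, the target of 38 channels is rounded up to 40, so the layer becomes both thinner (fewer channels) and sparser (roughly half of the surviving weights set to zero). In practice, a fine-tuning pass after each stage would be needed to recover accuracy.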

