LGCVMLMay 29, 2018

Channel Gating Neural Networks

arXiv:1805.12549v2205 citations
Originality Highly original
AI Analysis

This work addresses efficiency issues in CNN inference for applications requiring real-time or low-power deployment, offering a hardware-friendly solution with incremental improvements over existing pruning methods.

This paper tackles the problem of reducing computation costs in convolutional neural networks by introducing channel gating, a dynamic pruning scheme that skips computations on less important input channels, achieving 2.7-8.0x FLOP reduction on CIFAR-10 and 2.6x reduction on ImageNet with minimal accuracy loss.

This paper introduces channel gating, a dynamic, fine-grained, and hardware-efficient pruning scheme to reduce the computation cost for convolutional neural networks (CNNs). Channel gating identifies regions in the features that contribute less to the classification result, and skips the computation on a subset of the input channels for these ineffective regions. Unlike static network pruning, channel gating optimizes CNN inference at run-time by exploiting input-specific characteristics, which allows substantially reducing the compute cost with almost no accuracy loss. We experimentally show that applying channel gating in state-of-the-art networks achieves 2.7-8.0$\times$ reduction in floating-point operations (FLOPs) and 2.0-4.4$\times$ reduction in off-chip memory accesses with a minimal accuracy loss on CIFAR-10. Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet. We further demonstrate that channel gating can be realized in hardware efficiently. Our approach exhibits sparsity patterns that are well-suited to dense systolic arrays with minimal additional hardware. We have designed an accelerator for channel gating networks, which can be implemented using either FPGAs or ASICs. Running a quantized ResNet-18 model for ImageNet, our accelerator achieves an encouraging speedup of 2.4$\times$ on average, with a theoretical FLOP reduction of 2.8$\times$.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes