LG · Dec 1, 2017

Structured Deep Neural Network Pruning via Matrix Pivoting

Ranko Sredojevic, Shaoyi Cheng, Lazar Supic, Rawan Naous, Vladimir Stojanovic

arXiv:1712.01084v1 · 3.78 citations

Originality: Incremental advance

AI Analysis

This work addresses resource constraints on mobile, embedded, and IoT devices by improving DNN pruning methods, though it appears incremental as it builds on existing pruning approaches.

The paper aims to make deep neural networks more efficient on edge devices by introducing pruning via matrix pivoting, a compromise between architecture-oblivious pruning (which offers design flexibility) and architecture-aware pruning (which offers performance efficiency). Combined with the described implementation optimizations, it reports a speedup that is close to linear in the reduction of network coefficients.

Deep Neural Networks (DNNs) are key to state-of-the-art machine vision, sensor fusion and audio/video signal processing. Unfortunately, their computational complexity and the tight resource constraints on the Edge make them hard to leverage on mobile, embedded and IoT devices. Due to the great diversity of Edge devices, DNN designers have to take into account the hardware platform and application requirements during network training. In this work we introduce pruning via matrix pivoting as a way to improve network pruning by compromising between the design flexibility of architecture-oblivious pruning and the performance efficiency of architecture-aware pruning, the two dominant techniques for obtaining resource-efficient DNNs. We also describe local and global network optimization techniques for efficient implementation of the resulting pruned networks. In combination, the proposed pruning and implementation result in a speedup that is close to linear in the reduction of network coefficients during pruning.
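The abstract does not spell out the pivoting procedure, so the following is only an illustrative sketch of why a permutation-structured sparsity pattern can give work proportional to the number of remaining coefficients. It assumes, purely for illustration, that a pruned weight matrix is a block-diagonal matrix whose rows and columns have been permuted; the function names, the block sizes, and the permutations are hypothetical and are not taken from the paper.

```python
import numpy as np

def make_permuted_block_diagonal(blocks, row_perm, col_perm):
    """Build a sparse-looking matrix W by scattering dense blocks
    through a row permutation and a column permutation (assumed structure)."""
    n_rows = sum(b.shape[0] for b in blocks)
    n_cols = sum(b.shape[1] for b in blocks)
    block_diag = np.zeros((n_rows, n_cols))
    r = c = 0
    for b in blocks:
        block_diag[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    w = np.zeros_like(block_diag)
    # W[row_perm[i], col_perm[j]] = block_diag[i, j]
    w[np.ix_(row_perm, col_perm)] = block_diag
    return w

def blockwise_matvec(blocks, row_perm, col_perm, x):
    """Compute W @ x using only the dense blocks; returns (y, multiply-adds)."""
    x_permuted = x[col_perm]            # gather inputs into block order
    out = np.empty(len(row_perm))
    r = c = 0
    macs = 0
    for b in blocks:
        out[r:r + b.shape[0]] = b @ x_permuted[c:c + b.shape[1]]
        macs += b.size                  # one multiply-add per kept coefficient
        r += b.shape[0]
        c += b.shape[1]
    y = np.empty_like(out)
    y[row_perm] = out                   # scatter outputs back to original order
    return y, macs

rng = np.random.default_rng(0)
blocks = [rng.standard_normal(shape) for shape in [(2, 3), (3, 2), (1, 3)]]
row_perm = rng.permutation(6)
col_perm = rng.permutation(8)
W = make_permuted_block_diagonal(blocks, row_perm, col_perm)
x = rng.standard_normal(8)
y, macs = blockwise_matvec(blocks, row_perm, col_perm, x)
dense_macs = W.size                     # cost of an unstructured dense product
```

In this toy setup the blockwise product touches 15 coefficients instead of the 48 a dense product would, and the two permutations are the only bookkeeping needed to recover the original ordering. Whether the paper's method uses exactly this structure is an assumption of the sketch.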
