LG · AR · Jul 9, 2021

Structured Model Pruning of Convolutional Networks on Tensor Processing Units

arXiv:2107.04191v2 · 74 citations
AI Analysis

This work addresses efficiency challenges in deploying neural networks on specialized hardware such as TPUs, but it is incremental because it applies existing pruning methods to new hardware.

The paper tackles the high computational and storage requirements of convolutional neural networks by evaluating structured model pruning methods on Tensor Processing Units (TPUs). It shows that structured pruning can significantly improve memory usage and speed without accuracy loss, especially on small datasets such as CIFAR-10.

The deployment of convolutional neural networks is often hindered by high computational and storage requirements. Structured model pruning is a promising approach to alleviate these requirements. Using the VGG-16 model as an example, we measure the accuracy-efficiency trade-off for various structured model pruning methods and datasets (CIFAR-10 and ImageNet) on Tensor Processing Units (TPUs). To measure the actual performance of models, we develop a structured model pruning library for TensorFlow2 to modify models in place (instead of adding mask layers). We show that structured model pruning can significantly improve model memory usage and speed on TPUs without losing accuracy, especially for small datasets (e.g., CIFAR-10).
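The difference between modifying a model in place and adding mask layers can be illustrated with a minimal NumPy sketch. It is not the paper's library: the function name `prune_filters`, the `keep_ratio` parameter, and the L1-norm ranking are assumptions chosen for illustration, and the kernel layout `(height, width, in_channels, out_channels)` follows the usual TensorFlow convention. The point is that low-scoring filters are sliced out of the weight tensors, so the layer actually becomes smaller, and the next layer's input channels are trimmed to match.

```python
import numpy as np


def prune_filters(kernel, bias, next_kernel, keep_ratio):
    """Remove the output filters of `kernel` with the smallest L1 norm.

    kernel:      (kh, kw, c_in, c_out)   weights of the layer being pruned
    bias:        (c_out,)                bias of the layer being pruned
    next_kernel: (kh, kw, c_out, c_next) weights of the following conv layer
    keep_ratio:  fraction of output filters to keep (hypothetical parameter)
    """
    c_out = kernel.shape[-1]
    n_keep = max(1, int(round(c_out * keep_ratio)))
    # L1 norm of each output filter, used here as an assumed importance score.
    scores = np.abs(kernel).sum(axis=(0, 1, 2))
    # Indices of the n_keep highest-scoring filters, kept in original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # Slice the tensors: pruned filters are removed, not multiplied by zero.
    return kernel[..., keep], bias[keep], next_kernel[:, :, keep, :], keep


# Two stacked 3x3 conv layers with VGG-like widths (random weights).
rng = np.random.default_rng(0)
k1 = rng.normal(size=(3, 3, 3, 64))
b1 = np.zeros(64)
k2 = rng.normal(size=(3, 3, 64, 128))

pk1, pb1, pk2, kept = prune_filters(k1, b1, k2, keep_ratio=0.5)
print(pk1.shape, pb1.shape, pk2.shape)
# (3, 3, 3, 32) (32,) (3, 3, 32, 128)
```

A mask-based approach would instead keep all 64 filters and multiply the pruned ones by zero, which leaves parameter count and dense compute unchanged. That is why physically shrinking the tensors is needed to measure real memory and speed effects on hardware.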
