LG | CV | Nov 1, 2021

Learning Pruned Structure and Weights Simultaneously from Scratch: an Attention based Approach

arXiv:2111.02399v2 | 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses practitioners' need for more efficient neural networks with reduced storage and faster run time, though it is an incremental improvement over existing pruning techniques.

The paper tackles deep learning model compression by proposing ASWL, an unstructured pruning pipeline that uses layer-wise attention to learn the pruned structure and the weights simultaneously from scratch. The authors report superior accuracy, pruning ratio, and efficiency on MNIST, CIFAR-10, and ImageNet compared to state-of-the-art pruning methods.

As a deep learning model typically contains millions of trainable weights, there has been a growing demand for more efficient network structures with reduced storage space and improved run-time efficiency. Pruning is one of the most popular network compression techniques. In this paper, we propose a novel unstructured pruning pipeline, Attention-based Simultaneous sparse structure and Weight Learning (ASWL). Unlike traditional channel-wise or weight-wise attention mechanisms, ASWL uses an efficient algorithm to calculate the pruning ratio of each layer through layer-wise attention, and it tracks the weights of both the dense network and the sparse network so that the pruned structure is learned simultaneously from randomly initialized weights. Our experiments on MNIST, CIFAR-10, and ImageNet show that ASWL achieves superior pruning results in terms of accuracy, pruning ratio, and operating efficiency when compared with state-of-the-art network pruning methods.
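
The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how layer-wise attention is turned into a pruning ratio, so the sigmoid mapping, the magnitude-based mask, and the names `sigmoid`, `magnitude_mask`, and `sparse_view` are all assumptions made for illustration. The sketch only shows how one attention value per layer could set that layer's pruning ratio while the dense weights are kept alongside the derived sparse weights.

```python
import math


def sigmoid(x):
    """Map an unbounded attention logit to a ratio in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def magnitude_mask(weights, prune_ratio):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights.

    Assumption: pruning is by weight magnitude; the abstract does not confirm this.
    """
    n_prune = int(len(weights) * prune_ratio)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def sparse_view(dense_layers, attention_logits):
    """Derive sparse weights from dense weights, one pruning ratio per layer.

    The dense weights are left untouched, so a weight pruned now could be
    restored later if the layer's ratio or the magnitudes change.
    """
    sparse_layers, ratios = [], []
    for weights, logit in zip(dense_layers, attention_logits):
        ratio = sigmoid(logit)  # hypothetical attention-to-ratio mapping
        mask = magnitude_mask(weights, ratio)
        sparse_layers.append([w * m for w, m in zip(weights, mask)])
        ratios.append(ratio)
    return sparse_layers, ratios


# Two toy "layers" of flattened weights and one attention logit per layer.
dense = [[0.9, -0.05, 0.4, 0.01], [0.2, -0.7, 0.03, 0.5]]
logits = [0.0, -2.0]
sparse, ratios = sparse_view(dense, logits)
```

In this toy run the first layer has a ratio of 0.5, so its two smallest weights are zeroed, while the second layer's ratio is too small to prune any of its four weights. In a real training loop the logits would be learned together with the dense weights; that part is omitted here.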

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
