CV, LG, ML · Apr 5, 2020

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation

arXiv:2004.02164v5 · 116 citations
AI Analysis

This paper addresses the efficiency bottleneck of sparsity allocation in neural network pruning for resource-constrained applications, offering an incremental improvement over existing methods.

The paper tackles inefficient sparsity allocation in budgeted pruning by proposing Differentiable Sparsity Allocation (DSA), which uses gradient-based optimization to find layer-wise pruning ratios in continuous space. On CIFAR-10 and ImageNet, the authors report better performance than iterative budgeted pruning methods and an overall pruning time shortened by at least 1.5x.

Budgeted pruning is the problem of pruning under resource constraints. In budgeted pruning, the key problem is how to distribute the resources across layers (i.e., sparsity allocation). Traditional methods solve it by discretely searching for the layer-wise pruning ratios, which is inefficient. In this paper, we propose Differentiable Sparsity Allocation (DSA), an efficient end-to-end budgeted pruning flow. Utilizing a novel differentiable pruning process, DSA finds the layer-wise pruning ratios with gradient-based optimization. It allocates sparsity in continuous space, which is more efficient than methods based on discrete evaluation and search. Furthermore, DSA can work in a pruning-from-scratch manner, whereas traditional budgeted pruning methods are applied to pre-trained models. Experimental results on CIFAR-10 and ImageNet show that DSA achieves better performance than current iterative budgeted pruning methods while shortening the time cost of the overall pruning process by at least 1.5x.
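
To make the idea of allocating sparsity in continuous space concrete, the following is a minimal toy sketch, not the paper's algorithm. It treats each layer's keep ratio as a continuous variable and runs plain gradient descent on a proxy objective plus a quadratic penalty for exceeding the budget. Everything in it is an assumption made for illustration: the function name `allocate_keep_ratios`, the per-layer `costs` and `sensitivities`, the proxy loss `s_i * (1 - k_i)^2`, and the penalty formulation (swapped in here in place of whatever constrained optimizer the authors use).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def allocate_keep_ratios(costs, sensitivities, budget,
                         rho=10.0, lr=0.02, steps=20000):
    """Toy continuous sparsity allocation (illustrative assumption, not DSA).

    costs[i]         : hypothetical resource cost of layer i when fully kept
    sensitivities[i] : hypothetical weight on the proxy loss of pruning layer i
    budget           : target total cost, sum(costs[i] * keep[i])
    """
    # Keep ratios are parametrized as sigmoid(logit) so they stay in (0, 1).
    logits = [0.0] * len(costs)
    for _ in range(steps):
        keep = [sigmoid(a) for a in logits]
        total_cost = sum(c * k for c, k in zip(costs, keep))
        overshoot = max(0.0, total_cost - budget)
        for i, k in enumerate(keep):
            # d/dk of  s_i * (1 - k)^2  +  rho * overshoot^2
            grad_k = (-2.0 * sensitivities[i] * (1.0 - k)
                      + 2.0 * rho * overshoot * costs[i])
            # Chain rule through the sigmoid: dk/dlogit = k * (1 - k)
            logits[i] -= lr * grad_k * k * (1.0 - k)
    return [sigmoid(a) for a in logits]


if __name__ == "__main__":
    costs = [4.0, 2.0, 1.0]          # made-up per-layer costs
    sensitivities = [1.0, 4.0, 0.5]  # made-up per-layer sensitivities
    keep = allocate_keep_ratios(costs, sensitivities, budget=3.5)
    print("keep ratios:", [round(k, 3) for k in keep])
    print("total cost:", round(sum(c * k for c, k in zip(costs, keep)), 3))
```

Under this toy objective, the layer with the highest sensitivity per unit cost ends up with the largest keep ratio, and the total cost settles slightly above the budget because a finite penalty weight `rho` only enforces the constraint approximately. A discrete search would instead have to evaluate many candidate ratio combinations one by one, which is the inefficiency the abstract refers to.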

Code Implementations · 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
