IV, CV, LG | Dec 7, 2020

The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics

arXiv:2012.03827v1 | 1 citation
Originality: Incremental advance
AI Analysis

This work targets practitioners who need to reduce the capacity of deep neural networks. It shows that L2 regularization (weight decay), combined with a novel stochastic pruning method, is effective for pruning. The contribution is an incremental improvement over existing pruning techniques.

This paper investigates the role of L1 and L2 regularization in network pruning, focusing on the dynamics of node pruning. The authors propose a stochastic weight pruning framework that, combined with weight decay, removes 50% of the nodes in an MLP for MNIST, 60% of the filters in VGG-16 for CIFAR10, 60% of the channels in a U-Net for instance segmentation, and 50% of the channels in a CNN for COVID-19 detection, while maintaining competitive accuracy.

The pressing need to reduce the capacity of deep neural networks has stimulated the development of network dilution methods and their analysis. While the ability of $L_1$ and $L_0$ regularization to encourage sparsity is often mentioned, $L_2$ regularization is seldom discussed in this context. We present a novel framework for weight pruning by sampling from a probability function that favors the zeroing of smaller weights. In addition, we examine the contribution of $L_1$ and $L_2$ regularization to the dynamics of node pruning while optimizing for weight pruning. We then demonstrate the effectiveness of the proposed stochastic framework when used together with a weight decay regularizer. On popular classification models, it removes 50% of the nodes in an MLP for MNIST classification and 60% of the filters in VGG-16 for CIFAR10 classification. On medical image models, it removes 60% of the channels in a U-Net for instance segmentation and 50% of the channels in a CNN model for COVID-19 detection. For these node-pruned networks, we also present competitive weight pruning results that are only slightly less accurate than the original, dense networks.
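The idea of sampling a pruning decision per weight, with smaller weights more likely to be zeroed, can be illustrated with a minimal sketch. The abstract does not specify the probability function, so the form `exp(-|w| / temperature)` used here is an assumption chosen only for illustration, as are the names `stochastic_prune`, `pruned_node_fraction`, and the `temperature` parameter. The second helper shows how weight pruning can turn into node pruning: a node counts as removed once all of its incoming weights are zero.

```python
import math
import random


def stochastic_prune(weights, temperature=0.05, rng=None):
    """Zero each weight at random, favoring the removal of small weights.

    ASSUMPTION: the zeroing probability exp(-|w| / temperature) is a
    hypothetical stand-in, not the probability function from the paper.
    `weights` is a list of rows, one row of incoming weights per node.
    """
    rng = rng or random.Random()
    pruned = []
    for row in weights:
        new_row = []
        for w in row:
            p_zero = math.exp(-abs(w) / temperature)  # near 1 for tiny |w|
            new_row.append(0.0 if rng.random() < p_zero else w)
        pruned.append(new_row)
    return pruned


def pruned_node_fraction(weights):
    """Fraction of nodes (rows) whose incoming weights are all zero."""
    dead = sum(1 for row in weights if all(w == 0.0 for w in row))
    return dead / len(weights)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy layer: 8 nodes with 16 incoming weights each.
    layer = [[rng.gauss(0.0, 0.1) for _ in range(16)] for _ in range(8)]
    # Mimic the effect of weight decay by shrinking half the rows toward zero.
    for row in layer[:4]:
        for i in range(len(row)):
            row[i] *= 1e-4
    pruned = stochastic_prune(layer, rng=rng)
    print("fraction of nodes pruned:", pruned_node_fraction(pruned))
```

In this toy setup the rows that were shrunk toward zero are almost always removed in full, which mirrors the interaction the abstract describes between weight decay and node pruning. The actual sampling scheme and training procedure are those of the paper, not of this sketch.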
