A Probabilistic Approach to Neural Network Pruning
This work offers theoretical insight for machine learning researchers and practitioners who want to compress neural networks without losing performance, though the contribution is incremental because it builds on existing pruning methods.
The paper addresses the lack of analytical understanding of the capabilities and compression ratios of pruned neural networks. It theoretically analyzes random and magnitude-based pruning on fully-connected and convolutional networks, and establishes that there exist pruned networks whose expressive power is within any specified bound of the target network.
Neural network pruning techniques reduce the number of parameters without compromising the predictive ability of a network. Many algorithms have been developed for pruning both over-parameterized fully-connected networks (FCNs) and convolutional neural networks (CNNs), but analytical studies of the capabilities and compression ratios of such pruned sub-networks are lacking. We theoretically study the performance of two pruning techniques (random and magnitude-based) on FCNs and CNNs. Given a target network whose weights are independently sampled from appropriate distributions, we provide a universal approach to bound the gap between a pruned network and the target network in a probabilistic sense. The results establish that there exist pruned networks with expressive power within any specified bound from the target network.
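As a rough illustration of the two pruning schemes named above, the following NumPy sketch prunes a single randomly initialized fully-connected ReLU layer and measures the output gap to the unpruned layer. It is not the paper's construction or proof technique: the uniform weight distribution, the layer width, the keep ratio, and the helper names (`random_prune`, `magnitude_prune`, `layer_gap`) are all assumptions made for the example.

```python
import numpy as np


def _mask_from_indices(shape, idx):
    """Boolean mask of the given shape that is True at the flat indices idx."""
    mask = np.zeros(int(np.prod(shape)), dtype=bool)
    mask[idx] = True
    return mask.reshape(shape)


def random_prune(W, keep_ratio, rng):
    """Keep a uniformly random subset of the weights and zero out the rest."""
    k = int(round(keep_ratio * W.size))
    idx = rng.choice(W.size, size=k, replace=False)
    return W * _mask_from_indices(W.shape, idx)


def magnitude_prune(W, keep_ratio):
    """Keep the k largest-magnitude weights and zero out the rest."""
    k = int(round(keep_ratio * W.size))
    if k == 0:
        return np.zeros_like(W)
    flat = np.abs(W).ravel()
    # argpartition places the k largest magnitudes in the last k positions.
    idx = np.argpartition(flat, W.size - k)[W.size - k:]
    return W * _mask_from_indices(W.shape, idx)


def layer_gap(W, W_pruned, x):
    """Euclidean distance between the ReLU outputs of the two layers on x."""
    def relu(z):
        return np.maximum(z, 0.0)
    return np.linalg.norm(relu(W @ x) - relu(W_pruned @ x))


# Assumed setup: one d-by-d layer with i.i.d. uniform weights (illustrative only).
rng = np.random.default_rng(0)
d = 256
bound = 1.0 / np.sqrt(d)
W = rng.uniform(-bound, bound, size=(d, d))
x = rng.uniform(-1.0, 1.0, size=d)

keep_ratio = 0.5  # fraction of weights retained (assumed value)
W_rand = random_prune(W, keep_ratio, rng)
W_mag = magnitude_prune(W, keep_ratio)

print("random pruning gap:   ", layer_gap(W, W_rand, x))
print("magnitude pruning gap:", layer_gap(W, W_mag, x))
```

For a fixed number of retained weights, magnitude pruning removes the least possible Frobenius-norm mass, so `np.linalg.norm(W - W_mag)` never exceeds `np.linalg.norm(W - W_rand)`. The output gap on one particular input need not follow the same ordering, which is one reason a probabilistic statement over the weight distribution is the natural form for such bounds.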