The Power of Sparsity in Convolutional Neural Networks
This paper addresses resource efficiency in neural networks, a practical concern for practitioners, though the contribution appears incremental because it builds on existing sparsity techniques.
The paper tackles the high computational and memory demands of deep convolutional networks. It proposes deactivating connections between filters by generalizing 2D convolution to a channel-wise sparse connection structure, and it reports significantly better results than the baseline approach on large networks such as VGG and Inception V3, with savings in both run-time and memory.
Deep convolutional networks are well known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective way to trade accuracy for size and speed is simply to reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we take a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings in both run-time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3.
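To make the idea of a channel-wise sparse connection structure concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that connections are deactivated per (output channel, input channel) pair, chosen uniformly at random and fixed in advance. The abstract does not specify the selection rule, so that choice is an assumption. The names channel_mask, sparse_conv2d, and active_params are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def channel_mask(c_in, c_out, keep_fraction, seed=0):
    """Boolean (c_out, c_in) mask; True marks an active channel-to-channel connection.
    Assumption: active connections are sampled uniformly at random."""
    rng = np.random.default_rng(seed)
    n_total = c_in * c_out
    n_keep = max(1, int(round(keep_fraction * n_total)))
    flat = np.zeros(n_total, dtype=bool)
    flat[rng.choice(n_total, size=n_keep, replace=False)] = True
    return flat.reshape(c_out, c_in)

def sparse_conv2d(x, w, mask):
    """Stride-1, unpadded 2D convolution (cross-correlation, as in most
    deep learning libraries) that skips deactivated connections.
    x: (c_in, H, W), w: (c_out, c_in, kH, kW), mask: (c_out, c_in)."""
    c_out, c_in, kh, kw = w.shape
    _, height, width = x.shape
    out_h, out_w = height - kh + 1, width - kw + 1
    out = np.zeros((c_out, out_h, out_w))
    for o in range(c_out):
        for i in range(c_in):
            if not mask[o, i]:
                continue  # deactivated: no arithmetic, and the kernel need not be stored
            for dy in range(kh):
                for dx in range(kw):
                    out[o] += w[o, i, dy, dx] * x[i, dy:dy + out_h, dx:dx + out_w]
    return out

def active_params(mask, kh, kw):
    """Number of kernel weights that must actually be stored."""
    return int(mask.sum()) * kh * kw

# Usage: keep a quarter of the connections in a 3x3 layer with 8 inputs and 16 outputs.
mask = channel_mask(c_in=8, c_out=16, keep_fraction=0.25)
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 10, 10))
w = rng.standard_normal((16, 8, 3, 3))
y = sparse_conv2d(x, w, mask)
print(y.shape, active_params(mask, 3, 3), "of", w.size, "weights active")
```

In this sketch, keeping a fraction p of the connections scales both the stored weights and the multiply-adds of the layer by p. For comparison, the baseline of shrinking every layer's channel count by a factor f scales an interior layer's weights by roughly f squared, since both its input and output channel counts shrink.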