Dynamic Neural Network Channel Execution for Efficient Training
This work addresses the need for more efficient training and inference in deep learning, particularly in resource-constrained environments, by integrating pruning into the training procedure. The contribution is incremental in that it builds on existing pruning and dynamic execution techniques.
The paper aims to reduce the computational and memory costs of neural network training and inference. It proposes a dynamic channel execution method that executes only salient channels at each training step, and it reports up to a 4x reduction in computational cost and up to a 9x reduction in parameter count on CNNs such as VGGNet, ResNet, and DenseNet for image classification.
Existing methods for reducing the computational burden of neural networks at run-time, such as parameter pruning or dynamic computational path selection, focus solely on improving computational efficiency during inference. In contrast, we propose a novel method that reduces the memory footprint and the number of computing operations required for both training and inference. Our framework integrates pruning into the training procedure by exploring and tracking the relative importance of convolutional channels. At each training step, we select only a subset of highly salient channels to execute according to the combinatorial upper confidence bound algorithm, and run a forward and backward pass only on these activated channels, so that only their parameters are learned. This enables the efficient discovery of compact models. We validate our approach empirically on state-of-the-art CNNs (VGGNet, ResNet and DenseNet) and on several image classification datasets. Results demonstrate that our framework for dynamic channel execution reduces computational cost by up to 4x and parameter count by up to 9x, thus reducing the memory and computational demands of discovering and training compact neural network models.
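The channel selection step described in the abstract can be illustrated with a small sketch of a combinatorial upper confidence bound (CUCB) selector. This is not the authors' implementation: the class name `ChannelCUCB`, the helper `simulate`, the saliency values in [0, 1], and the exploration bonus sqrt(3 ln t / (2 n)) taken from the standard CUCB formulation are all assumptions made for illustration. The abstract does not specify how saliency is measured, so the sketch replaces the forward and backward pass with a synthetic saliency signal.

```python
import math
import random


class ChannelCUCB:
    """Hypothetical per-layer selector: keeps a running saliency estimate
    for each channel and picks the k channels with the highest UCB score."""

    def __init__(self, num_channels, k):
        if not 0 < k <= num_channels:
            raise ValueError("k must be between 1 and num_channels")
        self.k = k
        self.counts = [0] * num_channels   # times each channel was executed
        self.means = [0.0] * num_channels  # running mean of observed saliency
        self.t = 0                         # training step counter

    def select(self):
        """Return the sorted indices of the k channels to execute this step."""
        self.t += 1
        scored = []
        for c, (n, mu) in enumerate(zip(self.counts, self.means)):
            if n == 0:
                score = math.inf  # never executed: explore it first
            else:
                # Standard CUCB bonus (assumed here, not taken from the paper).
                score = mu + math.sqrt(3.0 * math.log(self.t) / (2.0 * n))
            scored.append((score, c))
        # Highest score first; ties are broken by the lower channel index.
        scored.sort(key=lambda sc: (-sc[0], sc[1]))
        return sorted(c for _, c in scored[: self.k])

    def update(self, channels, saliencies):
        """Fold the saliency observed for each executed channel into its mean."""
        for c, s in zip(channels, saliencies):
            self.counts[c] += 1
            self.means[c] += (s - self.means[c]) / self.counts[c]


def simulate(true_saliency, k, steps, noise=0.05, seed=0):
    """Toy loop: a synthetic saliency signal stands in for the
    forward/backward pass over the active channels."""
    rng = random.Random(seed)
    bandit = ChannelCUCB(len(true_saliency), k)
    for _ in range(steps):
        active = bandit.select()
        observed = [
            min(1.0, max(0.0, true_saliency[c] + rng.gauss(0.0, noise)))
            for c in active
        ]
        bandit.update(active, observed)
    return bandit.counts


if __name__ == "__main__":
    counts = simulate([0.9, 0.8, 0.7, 0.1, 0.1, 0.1, 0.1, 0.1], k=3, steps=2000)
    print(counts)  # execution counts per channel
```

In this toy setting the execution counts concentrate on the channels with the highest underlying saliency, while the low-saliency channels are executed only occasionally for exploration. In an actual training loop, the channels returned by `select` would determine which convolutional filters take part in the forward and backward pass at that step.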