Adaptive Neuron Apoptosis for Accelerating Deep Learning on Large Scale Systems
This work addresses the computational bottleneck in training large-scale deep learning models and reports significant speedups and efficiency gains, though it is an incremental improvement on existing pruning techniques.
The paper accelerates deep learning training by adaptively removing redundant neurons during training. It reports 2-3x faster training and up to 30x fewer parameters (4-5x on average) on datasets such as ImageNet, and improves classification AUC on the Higgs Boson dataset from 0.88 to 0.94.
We present novel techniques to accelerate the convergence of deep learning algorithms through low-overhead removal of redundant neurons (neuron apoptosis), that is, neurons that do not contribute to model learning, during the training phase itself. We provide in-depth theoretical underpinnings of our heuristics (bounding accuracy loss and handling apoptosis of several neuron types) and present methods to conduct adaptive neuron apoptosis. Specifically, we improve the training time for several datasets by 2-3x while reducing the number of parameters by up to 30x (4-5x on average) on datasets such as ImageNet classification. For the Higgs Boson dataset, our implementation improves classification accuracy, measured by Area Under Curve (AUC), from 0.88 to 0.94 (on a scale of 0 to 1), while reducing the number of parameters by 3x in comparison to existing literature. The proposed methods achieve a 2.44x speedup in comparison to the default (no apoptosis) algorithm.
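To make the idea of removing neurons during training concrete, the following is a minimal sketch of one pruning step on a single hidden layer. It is not the paper's method: the abstract does not state the redundancy criterion, so the sketch assumes a simple one (a neuron is treated as redundant when the L2 norm of its outgoing weights falls below a threshold). The names `neuron_norms`, `apoptose`, `param_count`, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def neuron_norms(w_out):
    """L2 norm of each hidden neuron's outgoing weights (one row per neuron)."""
    return [math.sqrt(sum(w * w for w in row)) for row in w_out]


def apoptose(w_in, bias, w_out, threshold):
    """Remove hidden neurons whose outgoing-weight norm is below `threshold`.

    ASSUMPTION: the norm-threshold criterion is illustrative only; the source
    does not specify how redundancy is detected.

    w_in:  one row per input feature, one column per hidden neuron
    bias:  one entry per hidden neuron
    w_out: one row per hidden neuron, one column per output unit
    """
    keep = [j for j, n in enumerate(neuron_norms(w_out)) if n >= threshold]
    new_w_in = [[row[j] for j in keep] for row in w_in]   # drop pruned columns
    new_bias = [bias[j] for j in keep]                    # drop pruned biases
    new_w_out = [w_out[j] for j in keep]                  # drop pruned rows
    return new_w_in, new_bias, new_w_out, keep


def param_count(w_in, bias, w_out):
    """Total number of weights and biases around the hidden layer."""
    return sum(len(r) for r in w_in) + len(bias) + sum(len(r) for r in w_out)


# Toy layer: 2 inputs, 3 hidden neurons, 1 output.
w_in = [[0.5, -0.2, 0.1],
        [0.3, 0.8, -0.4]]
bias = [0.0, 0.1, -0.1]
w_out = [[1.2], [0.001], [-0.7]]  # the middle neuron barely affects the output

before = param_count(w_in, bias, w_out)
new_w_in, new_bias, new_w_out, keep = apoptose(w_in, bias, w_out, threshold=0.01)
after = param_count(new_w_in, new_bias, new_w_out)
print(f"kept neurons {keep}: {before} -> {after} parameters")
```

In a training loop, a step like this would run periodically between weight updates, so later epochs operate on smaller matrices. That is the intuition behind combining parameter reduction with faster training, although the schedule and the accuracy-loss bounds described in the abstract are not reproduced here.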