Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant
This work addresses the resource demands of deep learning in embedded applications such as autonomous driving, but its contribution is incremental: it extends an existing toolbox for empirical analysis.
The paper investigates sparsity in deep neural networks using the extended TensorQuant toolbox, focusing on deeper topologies to show differences in sparsity for activations, weights, and gradients across various classification problems.
Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and augmented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity that occurs in deep neural networks due to the high levels of redundancy in the network parameters. It has been shown that sparsity can be promoted deliberately and that the achieved sparsity can be very high. In many cases, however, the methods are evaluated on rather small topologies, and it is not clear whether the results transfer to deeper topologies. In this paper, the TensorQuant toolbox has been extended to offer a platform for investigating sparsity, especially in deeper models. Several practically relevant topologies for classification problems of varying size are investigated to show the differences in sparsity for activations, weights and gradients.
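The abstract does not define how sparsity is measured. A minimal sketch, assuming sparsity means the fraction of entries in a tensor that are zero (or whose magnitude falls below a small threshold), could look as follows. The names `flatten`, `sparsity` and `relu` are hypothetical helpers written for this illustration and are not part of the TensorQuant API.

```python
def flatten(nested):
    """Yield the scalar entries of an arbitrarily nested list or tuple."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def sparsity(tensor, threshold=0.0):
    """Return the fraction of entries with |value| <= threshold.

    Assumption: with threshold=0.0 this counts exact zeros only; a
    positive threshold also treats near-zero entries as zero.
    """
    values = list(flatten(tensor))
    if not values:
        raise ValueError("cannot compute sparsity of an empty tensor")
    zeros = sum(1 for v in values if abs(v) <= threshold)
    return zeros / len(values)


def relu(values):
    """Element-wise ReLU on a flat list; negative inputs become zero."""
    return [max(0.0, v) for v in values]


# Toy data, invented for illustration only.
weights = [[0.0, 0.5], [-0.25, 0.0]]
activations = relu([-1.0, 2.0, -3.0, 0.5])
gradients = [0.001, -0.002, 0.5, 1.0]

weight_sparsity = sparsity(weights)
activation_sparsity = sparsity(activations)
gradient_sparsity = sparsity(gradients, threshold=0.01)
```

The same function can be applied to activations, weights and gradients, so the three quantities can be compared on a common scale. The threshold is a design choice of this sketch: gradients are rarely exactly zero, so a small tolerance may be more informative there than an exact-zero count.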