Training Sparse Neural Networks
This work addresses the computational cost of large-scale computer vision models by training sparse neural networks, though the contribution appears incremental because it builds on existing sparse computation techniques.
The paper trains deep neural networks for computer vision that implicitly use sparse computations, selecting parameters through additional gate variables, and reports state-of-the-art compression results for sparse models.
Deep neural networks with lots of parameters are typically used for large-scale computer vision tasks such as image classification. This is a result of using dense matrix multiplications and convolutions. However, sparse computations are known to be much more efficient. In this work, we train and build neural networks which implicitly use sparse computations. We introduce additional gate variables to perform parameter selection and show that this is equivalent to using a spike-and-slab prior. We experimentally validate our method on both small and large networks and achieve state-of-the-art compression results for sparse neural network models.
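The gate-variable idea in the abstract can be illustrated with a minimal sketch. The abstract states only that gates perform parameter selection, so the details here are assumptions made for illustration: each weight gets a binary gate drawn from a Bernoulli distribution with its own probability, and the names `sample_gates`, `gated_dense`, and `sparsity` are hypothetical, not the authors' code.

```python
import random


def sample_gates(gate_probs, rng):
    """Draw one binary gate per weight (assumed Bernoulli sampling).

    gate_probs is a list of rows; gate (i, j) is 1 with probability
    gate_probs[i][j] and 0 otherwise.
    """
    return [[1 if rng.random() < p else 0 for p in row] for row in gate_probs]


def gated_dense(x, weights, gates):
    """Compute y = (W * G) x, where * is elementwise and G is binary.

    Weights whose gate is 0 are skipped entirely, which is where a
    sparse implementation would save work.
    """
    out = []
    for w_row, g_row in zip(weights, gates):
        acc = 0.0
        for xi, w, g in zip(x, w_row, g_row):
            if g:  # gated-off weights contribute nothing
                acc += w * xi
        out.append(acc)
    return out


def sparsity(gates):
    """Fraction of weights that the gates switch off."""
    total = sum(len(row) for row in gates)
    zeros = sum(1 for row in gates for g in row if g == 0)
    return zeros / total


# Illustrative values only; these are not taken from the paper.
weights = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
gates = [[1, 0, 0], [0, 0, 1]]
y = gated_dense([1.0, 1.0, 1.0], weights, gates)
```

With these gates only two of the six weights are used, so `sparsity(gates)` is 4/6. In this sketch a weight is either exactly zero (gate off) or takes its learned value (gate on), which is the intuition behind the spike-and-slab connection the abstract mentions.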