Embedding Differentiable Sparsity into Deep Neural Network
The work addresses the need for more efficient and interpretable neural networks, though the contribution appears incremental because it builds on existing sparsity techniques.
The paper tackles the problem of embedding sparsity into deep neural networks: parameters can become exactly zero during training with stochastic gradient descent, so the sparsified structure and the weights are learned simultaneously. The approach supports both structured and unstructured sparsity.
In this paper, we propose embedding sparsity into the structure of deep neural networks, so that model parameters can become exactly zero during training with stochastic gradient descent. The network can thus learn its sparsified structure and its weights simultaneously. The proposed approach can learn structured as well as unstructured sparsity.
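The abstract does not spell out how a parameter reaches exactly zero under gradient-based training, so the following is only a sketch under stated assumptions, not the paper's formulation. It assumes a soft-threshold reparameterization (each effective weight is a gated function of a raw parameter) combined with an L1 penalty on the raw parameter. All names (`soft_threshold`, `group_soft_threshold`, `train`) and all constants (`t`, `lam`, `lr`) are hypothetical. The elementwise gate illustrates unstructured sparsity; the group gate zeroes a whole vector at once, as structured sparsity would for a channel or neuron.

```python
import math

def soft_threshold(v, t):
    """Elementwise gate: sign(v) * max(|v| - t, 0). Exactly 0.0 when |v| <= t."""
    mag = abs(v) - t
    if mag <= 0.0:
        return 0.0
    return math.copysign(mag, v)

def soft_threshold_grad(v, t):
    """(Sub)gradient of soft_threshold with respect to v."""
    return 1.0 if abs(v) > t else 0.0

def group_soft_threshold(vs, t):
    """Group gate: scales the whole vector by max(1 - t/||vs||, 0),
    so the group is kept or zeroed as a unit (structured sparsity)."""
    norm = math.sqrt(sum(v * v for v in vs))
    if norm <= t:
        return [0.0 for _ in vs]
    scale = 1.0 - t / norm
    return [scale * v for v in vs]

def train(targets, t=0.1, lam=0.05, lr=0.1, steps=500):
    """Toy objective (an assumption, chosen for clarity):
    sum_i 0.5 * (w_i - target_i)^2 + lam * |v_i|, with w_i = soft_threshold(v_i, t).
    Plain gradient descent on the raw parameters v; returns the effective weights w."""
    vs = [0.5 for _ in targets]  # raw (ungated) parameters
    for _ in range(steps):
        for i, target in enumerate(targets):
            w = soft_threshold(vs[i], t)
            # Chain rule through the gate, plus the L1 subgradient on v.
            grad = (w - target) * soft_threshold_grad(vs[i], t)
            if vs[i] != 0.0:
                grad += lam * math.copysign(1.0, vs[i])
            vs[i] -= lr * grad
    return [soft_threshold(v, t) for v in vs]

# The second target is 0.0, so its weight is "unneeded" and should be gated off.
weights = train(targets=[1.0, 0.0, 0.6])
print(weights)
print(group_soft_threshold([0.03, 0.04], 0.1))  # small group: zeroed as a unit
print(group_soft_threshold([3.0, 4.0], 0.1))    # large group: shrunk, kept
```

In this toy setup the weight with target 0.0 ends at exactly `0.0` rather than a small nonzero value, while the other two stay nonzero. Once a raw parameter falls inside the threshold, the gate passes no loss gradient, so that weight stays zero; whether the paper's method lets such weights recover is not stated in the abstract.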