SeReNe: Sensitivity based Regularization of Neurons for Structured Sparsity in Neural Networks
This method addresses the problem of deploying large neural networks on resource-constrained devices by reducing model size.
The paper introduces SeReNe, a method that prunes entire neurons according to how sensitive the network output is to changes in their activity. This approach achieves competitive compression ratios across multiple network architectures and datasets, enabling practical network footprint reduction.
Deep neural networks include millions of learnable parameters, making their deployment on resource-constrained devices problematic. SeReNe (Sensitivity-based Regularization of Neurons) is a method for learning structured sparse topologies, exploiting neural sensitivity as a regularizer. We define the sensitivity of a neuron as the variation of the network output with respect to the variation of the activity of the neuron. The lower the sensitivity of a neuron, the less the network output is perturbed if the neuron output changes. By including the neuron sensitivity in the cost function as a regularization term, we are able to prune neurons with low sensitivity. As entire neurons are pruned rather than single parameters, practical network footprint reduction becomes possible. Our experimental results on multiple network architectures and datasets yield competitive compression ratios with respect to state-of-the-art references.
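
To make the sensitivity definition above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy one-hidden-layer tanh network, takes the "activity" of a neuron to be its pre-activation, and estimates the sensitivity as the mean absolute derivative of the output with respect to that activity using central differences. The names forward, neuron_sensitivity, and prune_neurons, the threshold, and all weights are hypothetical and chosen only for illustration. The regularized training step described in the abstract is omitted; the sketch only measures sensitivity and removes whole neurons.

```python
import math

def forward(x, W1, w2, delta=None):
    """Toy net: y = sum_j w2[j] * tanh(W1[j] . x + delta[j]).
    delta optionally perturbs each hidden neuron's pre-activation."""
    if delta is None:
        delta = [0.0] * len(W1)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + d)
              for row, d in zip(W1, delta)]
    return sum(w * h for w, h in zip(w2, hidden))

def neuron_sensitivity(xs, W1, w2, eps=1e-5):
    """Mean |dy / d(activity_j)| over the inputs xs (assumed definition),
    estimated with central finite differences."""
    n = len(W1)
    sens = [0.0] * n
    for x in xs:
        for j in range(n):
            plus = [eps if k == j else 0.0 for k in range(n)]
            minus = [-eps if k == j else 0.0 for k in range(n)]
            dy = (forward(x, W1, w2, plus) - forward(x, W1, w2, minus)) / (2 * eps)
            sens[j] += abs(dy) / len(xs)
    return sens

def prune_neurons(W1, w2, sens, threshold):
    """Drop entire hidden neurons whose sensitivity is below the threshold."""
    keep = [j for j, s in enumerate(sens) if s >= threshold]
    return [W1[j] for j in keep], [w2[j] for j in keep], keep

# Hypothetical weights: neuron 1 has a zero output weight, so changing
# its activity cannot change the network output.
W1 = [[1.0, -0.5], [0.3, 0.8], [2.0, 1.0]]
w2 = [1.5, 0.0, -0.7]
xs = [[0.5, -1.0], [1.0, 0.2], [-0.3, 0.7]]

sens = neuron_sensitivity(xs, W1, w2)
W1_pruned, w2_pruned, kept = prune_neurons(W1, w2, sens, threshold=1e-3)
```

Because a whole row of W1 and the matching entry of w2 are removed, the pruned network is genuinely smaller, which is the structured-sparsity point made in the abstract. In this toy case the removed neuron had no influence on the output, so the pruned and original networks agree on every input.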