Spiking Layer-Adaptive Magnitude-based Pruning
This work targets energy-efficient deployment of SNNs for edge computing. Its contribution is incremental: it adapts existing layer-adaptive magnitude pruning methods to the temporal dynamics of SNNs.
The paper tackles efficient deployment of Spiking Neural Networks (SNNs), where naive pruning degrades performance. It proposes SLAMP, which achieves substantial reductions in connectivity and spiking operations while preserving accuracy on CIFAR10, CIFAR100, and CIFAR10-DVS.
Spiking Neural Networks (SNNs) provide energy-efficient computation, but their deployment is constrained by dense connectivity and high spiking operation costs. Existing magnitude-based pruning strategies, when naively applied to SNNs, fail to account for temporal accumulation, non-uniform timestep contributions, and membrane stability, often leading to severe performance degradation. This paper proposes Spiking Layer-Adaptive Magnitude-based Pruning (SLAMP), a theory-guided pruning framework that generalizes layer-adaptive magnitude pruning to temporal SNNs by explicitly controlling worst-case output distortion across layers and timesteps. SLAMP formulates sparsity allocation as a temporal distortion-constrained optimization problem, yielding time-aware layer importance scores that reduce to conventional layer-adaptive pruning in the single-timestep limit. An efficient two-stage procedure is derived, combining temporal score estimation, global sparsity allocation, and magnitude pruning with retraining for stability recovery. Experiments on the CIFAR10, CIFAR100, and event-based CIFAR10-DVS datasets demonstrate that SLAMP achieves substantial connectivity and spiking operation reductions while preserving accuracy, enabling efficient and deployable SNN inference.
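For orientation, the conventional layer-adaptive magnitude pruning that SLAMP is said to reduce to in the single-timestep limit can be sketched as follows. This is a minimal sketch of the standard LAMP-style score (each squared weight divided by the sum of squared weights in the same layer that are at least as large), followed by one global threshold over all layers. It is not SLAMP itself: the abstract does not give the time-aware score, so no temporal term is included. The function names `lamp_scores` and `global_prune_masks`, and the representation of layers as flat lists of floats, are assumptions made for illustration.

```python
from itertools import accumulate


def lamp_scores(weights):
    """LAMP-style scores for one layer, given as a flat list of floats.

    Single-timestep baseline only; SLAMP's temporal score is not reproduced.
    """
    sq = [w * w for w in weights]
    # Indices of the weights, sorted by ascending magnitude.
    order = sorted(range(len(sq)), key=lambda i: sq[i])
    # suffix[k] = sum of squared weights at sorted positions k, k+1, ...
    suffix = list(accumulate(sq[i] for i in reversed(order)))[::-1]
    scores = [0.0] * len(sq)
    for k, i in enumerate(order):
        scores[i] = sq[i] / suffix[k] if suffix[k] > 0 else 0.0
    return scores


def global_prune_masks(layers, sparsity):
    """Return one keep-mask per layer using a single global score threshold.

    Weights whose score ties with the threshold are pruned, so slightly more
    than the requested fraction may be removed when scores repeat.
    """
    scores = [lamp_scores(w) for w in layers]
    pooled = sorted(s for layer in scores for s in layer)
    n_prune = int(sparsity * len(pooled))
    if n_prune == 0:
        return [[True] * len(layer) for layer in scores]
    threshold = pooled[n_prune - 1]
    return [[s > threshold for s in layer] for layer in scores]


# Two toy layers with very different weight scales (hypothetical values).
masks = global_prune_masks([[1.0, 2.0, 3.0, 4.0], [0.1, 0.2]], sparsity=0.5)
```

In this toy case the second layer keeps its weight of 0.2 while the first layer loses its weights of 1.0 and 2.0, even though those are larger in absolute terms. The largest weight in every layer always scores 1, which is what makes the allocation layer-adaptive rather than a plain global magnitude cut.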