LG · NE · Jun 4, 2025

Spiking Brain Compression: Exploring One-Shot Post-Training Pruning and Quantization for Spiking Neural Networks

arXiv:2506.03996v2 · 1 citation · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the efficiency challenge for SNNs on resource-constrained neuromorphic hardware, offering a faster one-shot compression method that is an incremental advance over prior iterative approaches.

The paper tackles the compression of Spiking Neural Networks (SNNs) for efficient deployment on neuromorphic hardware. It proposes Spiking Brain Compression (SBC), a one-shot post-training pruning and quantization framework. Among one-shot post-training methods, SBC achieves state-of-the-art results, with single-digit to double-digit accuracy gains over Optimal Brain Compression (OBC). It also approaches the accuracy of iterative methods while cutting compression time by 2-3 orders of magnitude.

Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. As neuromorphic hardware has limited memory and computing resources, weight pruning and quantization have recently been explored to improve SNNs' efficiency. State-of-the-art SNN pruning/quantization methods employ multiple compression and training iterations, increasing the cost for pre-trained or very large SNNs. In this paper, we propose a new one-shot post-training pruning/quantization framework, Spiking Brain Compression (SBC), that extends the Optimal Brain Compression (OBC) method to SNNs. SBC replaces the current-based loss found in OBC with a spike train-based objective whose Hessian is cheaply computable, allowing a single backward pass to prune or quantize synapses and analytically rescale the rest. Our experiments on models trained with neuromorphic datasets (N-MNIST, CIFAR10-DVS, DVS128-Gesture) and large static datasets (CIFAR-100, ImageNet) show state-of-the-art results for one-shot post-training compression methods on SNNs, with single-digit to double-digit accuracy gains compared to OBC. SBC also approaches the accuracy of costly iterative methods, while cutting compression time by 2-3 orders of magnitude.
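The "prune, then analytically rescale the rest" step mentioned in the abstract can be illustrated with the classic Optimal Brain Surgeon update that OBC builds on. The sketch below is not the paper's SBC implementation: it uses the standard layer-wise squared-error Hessian H = 2XX^T over calibration inputs, whereas SBC would substitute the Hessian of its spike train-based objective. The function names (`layer_hessian`, `obs_prune_row`, `quad_loss`), the damping constant, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def layer_hessian(X, damp=1e-2):
    """Hessian of the layer-wise squared error ||w X - w_hat X||^2.
    The diagonal damping is an assumption, added only to keep H invertible."""
    H = 2.0 * X @ X.T
    return H + damp * np.mean(np.diag(H)) * np.eye(H.shape[0])


def quad_loss(w_new, w_ref, H):
    """Quadratic loss increase 0.5 * dw^T H dw caused by changing the weights."""
    d = w_new - w_ref
    return 0.5 * d @ H @ d


def obs_prune_row(w, H, n_prune):
    """Greedily prune n_prune weights of one neuron's weight row, rescaling
    the surviving weights in closed form after each removal."""
    w = np.asarray(w, dtype=float).copy()
    H_inv = np.linalg.inv(H)
    alive = np.ones(w.size, dtype=bool)
    for _ in range(n_prune):
        # Saliency w_p^2 / [H^-1]_pp (constant factor 1/2 dropped);
        # already-pruned weights are excluded from the search.
        saliency = np.where(alive, w**2 / np.diag(H_inv), np.inf)
        p = int(np.argmin(saliency))
        col, row, pivot = H_inv[:, p].copy(), H_inv[p, :].copy(), H_inv[p, p]
        # Closed-form compensation of the remaining weights.
        w -= (w[p] / pivot) * col
        # Remove index p from the inverse Hessian (one elimination step).
        H_inv -= np.outer(col, row) / pivot
        # Clear round-off so the pruned weight stays exactly zero later on.
        H_inv[p, :] = 0.0
        H_inv[:, p] = 0.0
        H_inv[p, p] = 1.0
        w[p] = 0.0
        alive[p] = False
    return w, alive


# Toy example: 8 input synapses, 256 calibration samples (synthetic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 256))
w = rng.normal(size=8)
H = layer_hessian(X)

w_pruned, alive = obs_prune_row(w, H, n_prune=4)
w_naive = np.where(alive, w, 0.0)  # same mask, no rescaling

loss_comp = quad_loss(w_pruned, w, H)
loss_naive = quad_loss(w_naive, w, H)
print("loss with rescaling:   ", loss_comp)
print("loss without rescaling:", loss_naive)
```

For a quadratic loss, the compensated weights are the minimizer under the chosen mask, so the loss with rescaling should not exceed the loss from zeroing the same weights without compensation. The quantization variant in OBC follows the same pattern, with the quantization error of the selected weight taking the place of the weight itself in the compensation step.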

