NE, CV · May 10, 2023

Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation

Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Zhengyu Ma, Huihui Zhou, Xiaopeng Fan, Yonghong Tian

arXiv:2305.05954v315.521 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a critical challenge in deep SNNs for low-power AI applications, offering incremental improvements in accuracy across multiple datasets.

The paper tackles the problem of imprecise gradient backpropagation in deep spiking neural networks (SNNs) due to improper downsampling design, and proposes ConvBN-MaxPooling-LIF (CML) to overcome this, achieving state-of-the-art performance with improvements such as +1.79% on ImageNet and +1.16% on CIFAR100 compared to Spikingformer.

Deep spiking neural networks (SNNs) have drawn much attention in recent years because of their low power consumption, biological rationality and event-driven property. However, state-of-the-art deep SNNs (including Spikformer and Spikingformer) suffer from a critical challenge related to imprecise gradient backpropagation. This problem arises from the improper design of the downsampling modules in these networks and greatly hampers overall model performance. In this paper, we propose ConvBN-MaxPooling-LIF (CML), an SNN-optimized downsampling module with precise gradient backpropagation. We prove from a theoretical perspective that CML effectively overcomes the imprecision of gradient backpropagation. In addition, we evaluate CML on the ImageNet, CIFAR10, CIFAR100, CIFAR10-DVS, and DVS128-Gesture datasets, and show state-of-the-art performance on all of them, with significantly enhanced performance compared with Spikingformer. For instance, our model achieves 77.64% on ImageNet, 96.04% on CIFAR10, and 81.4% on CIFAR10-DVS, with +1.79% on ImageNet and +1.16% on CIFAR100 compared with Spikingformer.
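The module name ConvBN-MaxPooling-LIF suggests that max pooling is applied to the ConvBN output before the spiking neuron, rather than to the binary spikes after it. The toy sketch below is an illustration under that assumption and is not the authors' implementation. It uses a single-step Heaviside firing rule in place of a full LIF neuron (no leak, reset, time steps, or surrogate gradient), a hypothetical threshold `V_TH`, and a first-index tie-breaking rule for pooling. All function names are made up for the example. It shows that both orderings yield the same output spikes, while the pooling "winner" (the position a gradient would be routed to) can differ when several inputs in a window spike.

```python
from typing import List, Sequence, Tuple

V_TH = 1.0  # hypothetical firing threshold, chosen only for this illustration


def spike(v: float, v_th: float = V_TH) -> int:
    """Heaviside firing rule standing in for a LIF neuron (no leak or reset)."""
    return 1 if v >= v_th else 0


def max_pool_1d(xs: Sequence[float], window: int) -> Tuple[List[float], List[int]]:
    """Non-overlapping 1D max pooling.

    Returns the pooled values and the input index of each winner.
    Ties are broken by taking the first maximal element (an assumption).
    """
    values, winners = [], []
    for start in range(0, len(xs) - window + 1, window):
        block = xs[start:start + window]
        best = max(range(window), key=lambda i: block[i])  # first index on ties
        values.append(block[best])
        winners.append(start + best)
    return values, winners


def lif_then_pool(pre: Sequence[float], window: int = 2) -> Tuple[List[int], List[int]]:
    """Ordering assumed for the earlier design: spike first, then pool the spikes."""
    spikes = [spike(v) for v in pre]
    return max_pool_1d(spikes, window)


def pool_then_lif(pre: Sequence[float], window: int = 2) -> Tuple[List[int], List[int]]:
    """Ordering suggested by the CML name: pool the pre-activations, then spike."""
    pooled, winners = max_pool_1d(pre, window)
    return [spike(v) for v in pooled], winners


if __name__ == "__main__":
    pre_activations = [1.2, 3.0, 0.1, 0.4]  # stand-in for a ConvBN output
    print(lif_then_pool(pre_activations))   # ([1, 0], [0, 2])
    print(pool_then_lif(pre_activations))   # ([1, 0], [1, 3])
```

In the first window both inputs exceed the threshold, so pooling over spikes sees a tie and selects index 0, whereas pooling over pre-activations selects index 1, the element with the larger value. This is one plausible reading of the "imprecise gradient backpropagation" the abstract refers to; the paper's own analysis should be consulted for the actual argument.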
