LG, AI, CV, NE | Jun 12, 2024

Self-Distillation Learning Based on Temporal-Spatial Consistency for Spiking Neural Networks

arXiv:2406.07862v1 | 9 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency and performance issues in SNN training for low-power AI applications, offering an incremental improvement over existing knowledge distillation techniques.

Knowledge distillation for spiking neural networks (SNNs) normally relies on a separate teacher model, which adds computational cost and requires choosing a teacher architecture by hand. The paper replaces the teacher with a self-distillation method based on temporal-spatial consistency. It reports superior performance on CIFAR10/100, ImageNet, CIFAR10-DVS, and DVS-Gesture, with no added inference overhead.

Spiking neural networks (SNNs) have attracted considerable attention for their event-driven, low-power characteristics and high biological interpretability. Inspired by knowledge distillation (KD), recent research has improved the performance of SNN models with a pre-trained teacher model. However, additional teacher models require significant computational resources, and it is tedious to manually define an appropriate teacher network architecture. In this paper, we explore cost-effective self-distillation learning of SNNs to circumvent these concerns. Without an explicitly defined teacher, the SNN generates pseudo-labels and learns consistency during training. On the one hand, we extend the timestep of the SNN during training to create an implicit temporal "teacher" that guides the learning of the original "student", i.e., the temporal self-distillation. On the other hand, we guide the output of the weak classifier at the intermediate stage by the final output of the SNN, i.e., the spatial self-distillation. Our temporal-spatial self-distillation (TSSD) learning method does not introduce any inference overhead and has excellent generalization ability. Extensive experiments on the static image datasets CIFAR10/100 and ImageNet as well as the neuromorphic datasets CIFAR10-DVS and DVS-Gesture validate the superior performance of the TSSD method. This paper presents a novel way of fusing SNNs with KD, providing insights into high-performance SNN learning methods.
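The abstract gives no formulas, so the following is only a rough sketch of how the two consistency terms might combine into one training objective. It assumes a standard KD-style formulation (cross-entropy plus temperature-softened KL divergence); the function names, the weights `alpha` and `beta`, and the temperature `tau` are hypothetical and not taken from the paper. The inputs are plain logit arrays: how the SNN produces them (for example, by averaging outputs over timesteps) is left out.

```python
import numpy as np


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kl_div(p_guide, q_learner, eps=1e-12):
    """Batch-mean KL(p_guide || q_learner) between two probability arrays."""
    terms = p_guide * (np.log(p_guide + eps) - np.log(q_learner + eps))
    return float(np.mean(np.sum(terms, axis=-1)))


def cross_entropy(logits, labels):
    """Batch-mean cross-entropy against integer class labels."""
    p = softmax(logits)
    picked = p[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def tssd_loss(student_logits, teacher_logits, weak_logits, labels,
              alpha=1.0, beta=1.0, tau=2.0):
    """Hypothetical combined objective (an assumption, not the paper's formula).

    student_logits: final output of the SNN run with the original timestep.
    teacher_logits: final output of the same SNN run with an extended timestep.
    weak_logits:    output of an intermediate-stage (weak) classifier.
    In an autograd framework the guiding distributions would usually be
    treated as constants (no gradient); NumPy has no gradients to stop.
    """
    ce = cross_entropy(student_logits, labels)
    # Temporal term: the extended-timestep output guides the student output.
    temporal = kl_div(softmax(teacher_logits, tau), softmax(student_logits, tau))
    # Spatial term: the final output guides the intermediate weak classifier.
    spatial = kl_div(softmax(student_logits, tau), softmax(weak_logits, tau))
    return ce + alpha * temporal + beta * spatial


# Toy usage with random logits: batch of 4 samples, 10 classes.
rng = np.random.default_rng(0)
labels = np.array([0, 3, 7, 9])
student = rng.normal(size=(4, 10))
teacher = student + 0.1 * rng.normal(size=(4, 10))
weak = rng.normal(size=(4, 10))
loss = tssd_loss(student, teacher, weak, labels)
```

Both KL terms touch only training-time quantities, which is consistent with the abstract's claim that the method adds no inference overhead: at test time only the original-timestep forward pass would be needed.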

