Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors
This work addresses energy-efficient object detection with SNNs. The method is novel and delivers strong performance gains, though the contribution is incremental in that it improves existing SNN-based detectors.
The paper tackles the limited expressive power of Spiking Neural Networks (SNNs) in object detection by proposing a Temporal Dynamics Enhancer (TDE) that strengthens temporal information modeling. It achieves mAP50-95 scores of 57.7% on PASCAL VOC and 47.6% on EvDET200K, and its spike-driven attention consumes only 0.240 times the energy of conventional attention modules.
Spiking Neural Networks (SNNs), with their brain-inspired spatiotemporal dynamics and spike-driven computation, have emerged as promising energy-efficient alternatives to Artificial Neural Networks (ANNs). However, existing SNNs typically replicate inputs directly or aggregate them into frames at fixed intervals. Such strategies lead to neurons receiving nearly identical stimuli across time steps, severely limiting the model's expressive power, particularly in complex tasks like object detection. In this work, we propose the Temporal Dynamics Enhancer (TDE) to strengthen SNNs' capacity for temporal information modeling. TDE consists of two modules: a Spiking Encoder (SE) that generates diverse input stimuli across time steps, and an Attention Gating Module (AGM) that guides the SE generation based on inter-temporal dependencies. Moreover, to eliminate the high-energy multiplication operations introduced by the AGM, we propose a Spike-Driven Attention (SDA) mechanism to reduce attention-related energy consumption. Extensive experiments demonstrate that TDE can be seamlessly integrated into existing SNN-based detectors and consistently outperforms state-of-the-art methods, achieving mAP50-95 scores of 57.7% on the static PASCAL VOC dataset and 47.6% on the neuromorphic EvDET200K dataset. In terms of energy consumption, the SDA consumes only 0.240 times the energy of conventional attention modules.
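To make the motivation concrete, the following minimal Python sketch shows what "replicating inputs directly" means for a single neuron. It is an illustration only and not the paper's Spiking Encoder or Attention Gating Module. The leaky integrate-and-fire update used here is one common discrete form, and the names `lif_spikes`, `replicate`, `apply_gains`, and the fixed `gains` list are hypothetical stand-ins chosen for this example. In particular, the hand-picked gains merely mimic the idea of time-varying stimuli; the abstract does not specify how the SE produces them.

```python
def lif_spikes(inputs, tau=2.0, v_threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron with a hard reset to 0.

    Returns a list of 0/1 spikes, one per time step.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        v = v + (x - v) / tau  # leaky integration toward the current input
        if v >= v_threshold:
            spikes.append(1)
            v = 0.0  # hard reset after firing
        else:
            spikes.append(0)
    return spikes


def replicate(x, num_steps):
    """Direct replication: the same value is presented at every time step."""
    return [x] * num_steps


def apply_gains(x, gains):
    """Hypothetical time-varying stimuli: scale the input by a per-step gain."""
    return [x * g for g in gains]


num_steps = 4
pixel = 1.2  # assumed normalized input intensity

replicated = replicate(pixel, num_steps)
varied = apply_gains(pixel, [0.5, 1.0, 1.5, 2.0])  # assumed gains, not learned

print("distinct stimuli (replicated):", len(set(replicated)))
print("distinct stimuli (varied):    ", len(set(varied)))
print("spikes (replicated):", lif_spikes(replicated))
print("spikes (varied):    ", lif_spikes(varied))
```

With replication the neuron sees a single distinct stimulus value over all four steps, so any temporal structure in its output comes only from its own membrane dynamics. With per-step variation the input itself carries information that differs across time, which is the property the abstract attributes to the SE.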