NE · Aug 17, 2024
Toward End-to-End Bearing Fault Diagnosis for Industrial Scenarios with Spiking Neural Networks
Lin Zuo, Yongqi Ding, Mengmeng Jing et al.
This paper explores the application of spiking neural networks (SNNs), known for their low-power binary spikes, to bearing fault diagnosis, bridging the gap between high-performance AI algorithms and real-world industrial scenarios. In particular, we identify two key limitations of existing SNN fault diagnosis methods: inadequate encoding capacity that necessitates cumbersome data preprocessing, and non-spike-oriented architectures that constrain the performance of SNNs. To alleviate these problems, we propose a Multi-scale Residual Attention SNN (MRA-SNN) to simultaneously improve the efficiency, performance, and robustness of SNN methods. By incorporating a lightweight attention mechanism, we design a multi-scale attention encoding module that extracts multi-scale fault features from vibration signals and encodes them as spatio-temporal spikes, eliminating the need for complicated preprocessing. Then, the spike residual attention block extracts high-dimensional fault features and enhances the expressiveness of sparse spikes with the attention mechanism for end-to-end diagnosis. In addition, the performance and robustness of MRA-SNN are further enhanced by introducing the lightweight attention mechanism within the spiking neurons to simulate the biological dendritic filtering effect. Extensive experiments on the MFPT, JNU, Bearing, and Gearbox benchmark datasets demonstrate that MRA-SNN significantly outperforms existing methods in terms of accuracy, energy consumption, and noise robustness, and is more feasible for deployment in real-world industrial scenarios.
AI · Aug 17, 2024
Temporal Reversal Regularization for Spiking Neural Networks: Hybrid Spatio-Temporal Invariance for Generalization
Lin Zuo, Yongqi Ding, Wenwei Luo et al.
Spiking neural networks (SNNs) have received widespread attention as an ultra-low power computing paradigm. Recent studies have shown that SNNs suffer from severe overfitting, which limits their generalization performance. In this paper, we propose a simple yet effective Temporal Reversal Regularization (TRR) to mitigate overfitting during training and facilitate generalization of SNNs. We exploit the inherent temporal properties of SNNs to perform input/feature temporal reversal perturbations, prompting the SNN to produce original-reversed consistent outputs and learn perturbation-invariant representations. To further enhance generalization, we utilize the lightweight "star operation" (Hadamard product) for feature hybridization of original and temporally reversed spike firing rates, which expands the implicit dimensionality and acts as a spatio-temporal regularizer. We show theoretically that our method tightens the upper bound of the generalization error, and extensive experiments on static/neuromorphic recognition as well as 3D point cloud classification tasks demonstrate its effectiveness, versatility, and adversarial robustness. In particular, our regularization significantly improves the recognition accuracy of low-latency SNNs for neuromorphic objects, contributing to the real-world deployment of neuromorphic computational software-hardware integration.
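The reversal-and-hybridization idea in this abstract can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the leaky integrate-and-fire neuron (`lif_spikes`, with `tau=2.0`, `v_th=1.0`, and hard reset), the helper names, and the mean-squared consistency term are all hypothetical choices made for illustration. It shows that feeding a time-reversed input through a stateful neuron can change its firing rate, and how the two rate vectors could be combined with a Hadamard product.

```python
from typing import List, Tuple


def lif_spikes(currents: List[float], tau: float = 2.0, v_th: float = 1.0) -> List[int]:
    """Toy leaky integrate-and-fire neuron with hard reset (assumed dynamics)."""
    v, spikes = 0.0, []
    for x in currents:
        v += (x - v) / tau          # leaky integration toward the input
        if v >= v_th:
            spikes.append(1)
            v = 0.0                 # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


def firing_rate(spikes: List[int]) -> float:
    """Fraction of timesteps in which the neuron fired."""
    return sum(spikes) / len(spikes)


def trr_features(inputs: List[List[float]]) -> Tuple[List[float], List[float], List[float], float]:
    """inputs[n] is the input sequence over T timesteps for neuron n."""
    orig = [firing_rate(lif_spikes(seq)) for seq in inputs]
    rev = [firing_rate(lif_spikes(seq[::-1])) for seq in inputs]  # temporal reversal
    hybrid = [a * b for a, b in zip(orig, rev)]                   # Hadamard product
    # Assumed consistency term: mean squared gap between the two rate vectors.
    consistency = sum((a - b) ** 2 for a, b in zip(orig, rev)) / len(orig)
    return orig, rev, hybrid, consistency


if __name__ == "__main__":
    print(trr_features([[0.4, 0.8, 1.6, 2.4], [0.0, 0.0, 0.0, 0.0]]))
```

In this toy setting the ramp-up input fires twice while its reversed version fires once, so the consistency term is non-zero; a regularizer of this kind would push the network toward representations that are less sensitive to the order of timesteps.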
91.5 · NE · Mar 12
Stable Spike: Dual Consistency Optimization via Bitwise AND Operations for Spiking Neural Networks
Yongqi Ding, Kunshan Yang, Linze Li et al.
Although the temporal spike dynamics of spiking neural networks (SNNs) enable low-power temporal pattern capture capabilities, they also incur inherent inconsistencies that severely compromise representation. In this paper, we perform dual consistency optimization via Stable Spike to mitigate this problem, thereby improving the recognition performance of SNNs. With the hardware-friendly "AND" bit operation, we efficiently decouple the stable spike skeleton from the multi-timestep spike maps, thereby capturing critical semantics while reducing inconsistencies from variable noise spikes. Enforcing the unstable spike maps to converge to the stable spike skeleton significantly improves the inherent consistency across timesteps. Furthermore, we inject amplitude-aware spike noise into the stable spike skeleton to diversify the representations while preserving consistent semantics. The SNN is encouraged to produce perturbation-consistent predictions, thereby contributing to generalization. Extensive experiments across multiple architectures and datasets validate the effectiveness and versatility of our method. In particular, our method significantly advances neuromorphic object recognition under ultra-low latency, improving accuracy by up to 8.33%. This helps unlock the full potential of SNNs in power consumption and speed.
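The bitwise decoupling step can be sketched in a few lines. This is an illustrative sketch only: it assumes the stable skeleton is the AND of the binary spike maps over all timesteps, and the bitmask packing and the helper names (`pack`, `unpack`, `stable_skeleton`, `unstable_residue`) are hypothetical. The amplitude-aware noise injection mentioned in the abstract is not covered here.

```python
from functools import reduce
from operator import and_
from typing import List


def pack(bits: List[int]) -> int:
    """Pack a binary spike map into an integer bitmask (bit i = neuron i)."""
    mask = 0
    for i, b in enumerate(bits):
        if b:
            mask |= 1 << i
    return mask


def unpack(mask: int, n: int) -> List[int]:
    """Expand a bitmask back into a list of n binary values."""
    return [(mask >> i) & 1 for i in range(n)]


def stable_skeleton(maps: List[int]) -> int:
    """Spikes present at every timestep: AND across all per-timestep masks."""
    return reduce(and_, maps)


def unstable_residue(maps: List[int], skeleton: int) -> List[int]:
    """Per-timestep spikes that are not part of the skeleton."""
    return [m & ~skeleton for m in maps]


if __name__ == "__main__":
    spike_maps = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]]  # T=3 timesteps, N=4 neurons
    masks = [pack(m) for m in spike_maps]
    skel = stable_skeleton(masks)
    print(unpack(skel, 4))
    print([unpack(r, 4) for r in unstable_residue(masks, skel)])
```

Packing spikes into integers keeps the whole operation to one AND per timestep, which is the kind of hardware-friendly primitive the abstract refers to.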
CV · Mar 28, 2025
A Semantic-Enhanced Heterogeneous Graph Learning Method for Flexible Objects Recognition
Kunshan Yang, Wenwei Luo, Yuguo Hu et al.
Flexible object recognition remains a significant challenge because flexible objects have inherently diverse shapes and sizes, translucent attributes, and subtle inter-class differences. Graph-based models, such as graph convolution networks and graph vision models, are promising for flexible object recognition due to their ability to capture variable relations within flexible objects. These methods, however, often focus on global visual relationships or fail to align semantic and visual information. To alleviate these limitations, we propose a semantic-enhanced heterogeneous graph learning method. First, an adaptive scanning module is employed to extract discriminative semantic context, facilitating the matching of flexible objects with varying shapes and sizes while aligning semantic and visual nodes to enhance cross-modal feature correlation. Second, a heterogeneous graph generation module aggregates global visual and local semantic node features, improving the recognition of flexible objects. Additionally, we introduce FSCW, a large-scale flexible dataset curated from existing sources. We validate our method through extensive experiments on flexible datasets (FDA and FSCW) and challenging benchmarks (CIFAR-100 and ImageNet-Hard), demonstrating competitive performance.
LG · Oct 9, 2025
Synergy Between the Strong and the Weak: Spiking Neural Networks are Inherently Self-Distillers
Yongqi Ding, Lin Zuo, Mengmeng Jing et al.
Brain-inspired spiking neural networks (SNNs) promise to be a low-power alternative to computationally intensive artificial neural networks (ANNs), although performance gaps persist. Recent studies have improved the performance of SNNs through knowledge distillation, but rely on large teacher models or introduce additional training overhead. In this paper, we show that SNNs can be naturally deconstructed into multiple submodels for efficient self-distillation. We treat each timestep instance of the SNN as a submodel and evaluate its output confidence, thus efficiently identifying the strong and the weak. Based on this strong and weak relationship, we propose two efficient self-distillation schemes: (1) Strong2Weak: During training, the stronger "teacher" guides the weaker "student", effectively improving overall performance. (2) Weak2Strong: The weak serve as the "teacher", distilling the strong in reverse with underlying dark knowledge, again yielding significant performance gains. For both distillation schemes, we offer flexible implementations such as ensemble, simultaneous, and cascade distillation. Experiments show that our method effectively improves the discriminability and overall performance of the SNN, while its adversarial robustness is also enhanced, benefiting from the stability brought by self-distillation. This ingeniously exploits the temporal properties of SNNs and provides insight into how to efficiently train high-performance SNNs.
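The per-timestep ranking and the two distillation directions can be sketched as follows. This is a hedged illustration rather than the paper's method: the abstract does not say how confidence is measured, so the sketch assumes the maximum softmax probability, and it assumes a KL-divergence distillation loss averaged over the non-teacher timesteps. All function names are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)                                  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p: List[float], q: List[float]) -> float:
    """KL(p || q): p plays the teacher, q the student."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rank_timesteps(logits_per_t: List[List[float]]) -> Tuple[List[List[float]], int, int]:
    """Return per-timestep probabilities plus the strongest and weakest timestep index."""
    probs = [softmax(l) for l in logits_per_t]
    conf = [max(p) for p in probs]                   # assumed confidence measure
    strong = max(range(len(conf)), key=conf.__getitem__)
    weak = min(range(len(conf)), key=conf.__getitem__)
    return probs, strong, weak


def distill_loss(probs: List[List[float]], teacher: int) -> float:
    """Average KL from the chosen teacher timestep to every other timestep."""
    others = [p for t, p in enumerate(probs) if t != teacher]
    return sum(kl_div(probs[teacher], p) for p in others) / len(others)


if __name__ == "__main__":
    probs, strong, weak = rank_timesteps([[2.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
    print("Strong2Weak loss:", distill_loss(probs, strong))  # strong timestep teaches
    print("Weak2Strong loss:", distill_loss(probs, weak))    # weak timestep teaches
```

In an actual training loop the teacher distribution would typically be treated as a constant (no gradient flows through it), and the loss would be added to the usual task loss.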
LG · Jun 12, 2024
Self-Distillation Learning Based on Temporal-Spatial Consistency for Spiking Neural Networks
Lin Zuo, Yongqi Ding, Mengmeng Jing et al.
Spiking neural networks (SNNs) have attracted considerable attention for their event-driven, low-power characteristics and high biological interpretability. Inspired by knowledge distillation (KD), recent research has improved the performance of the SNN model with a pre-trained teacher model. However, additional teacher models require significant computational resources, and it is tedious to manually define the appropriate teacher network architecture. In this paper, we explore cost-effective self-distillation learning of SNNs to circumvent these concerns. Without an explicitly defined teacher, the SNN generates pseudo-labels and learns consistency during training. On the one hand, we extend the timestep of the SNN during training to create an implicit temporal "teacher" that guides the learning of the original "student", i.e., the temporal self-distillation. On the other hand, we guide the output of the weak classifier at the intermediate stage by the final output of the SNN, i.e., the spatial self-distillation. Our temporal-spatial self-distillation (TSSD) learning method does not introduce any inference overhead and has excellent generalization ability. Extensive experiments on the static image datasets CIFAR10/100 and ImageNet as well as the neuromorphic datasets CIFAR10-DVS and DVS-Gesture validate the superior performance of the TSSD method. This paper presents a novel way of fusing SNNs with KD, providing insights into high-performance SNN learning methods.
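The two consistency terms described here can be sketched on plain lists of logits. This is a rough sketch under several assumptions that the abstract does not confirm: the student output is taken as the mean logits over the first `T` timesteps, the temporal teacher as the mean over all extended timesteps, both terms use KL divergence, and the names `mean_logits` and `tssd_losses` are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)                                  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p: List[float], q: List[float]) -> float:
    """KL(p || q): p is the teacher distribution, q the student."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_logits(logits_per_t: List[List[float]]) -> List[float]:
    """Average class logits over timesteps."""
    steps = len(logits_per_t)
    return [sum(col) / steps for col in zip(*logits_per_t)]


def tssd_losses(logits_ext: List[List[float]], T: int,
                intermediate_logits: List[float]) -> Tuple[float, float]:
    """logits_ext holds per-timestep logits for the extended run (more than T steps)."""
    student = softmax(mean_logits(logits_ext[:T]))   # original T-timestep output
    teacher = softmax(mean_logits(logits_ext))       # implicit extended-timestep teacher
    temporal = kl_div(teacher, student)              # temporal self-distillation term
    # Spatial term: the final output guides an intermediate (weaker) classifier.
    spatial = kl_div(student, softmax(intermediate_logits))
    return temporal, spatial


if __name__ == "__main__":
    ext = [[1.0, 0.0], [1.0, 0.0], [3.0, 0.0], [3.0, 0.0]]
    print(tssd_losses(ext, T=2, intermediate_logits=[0.2, 0.0]))
```

Because both terms are computed only from training-time outputs, dropping them at inference leaves the deployed network unchanged, which is consistent with the claim of no inference overhead.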
CV · Jun 6, 2024
Flexible ViG: Learning the Self-Saliency for Flexible Object Recognition
Lin Zuo, Kunshan Yang, Xianlong Tian et al.
Existing computer vision methods mainly focus on the recognition of rigid objects, whereas the recognition of flexible objects remains unexplored. Recognizing flexible objects poses significant challenges due to their inherently diverse shapes and sizes, translucent attributes, ambiguous boundaries, and subtle inter-class differences. In this paper, we claim that these problems primarily arise from the lack of object saliency. To this end, we propose the Flexible Vision Graph Neural Network (FViG) to optimize the self-saliency and thereby improve the discrimination of the representations for flexible objects. Specifically, on the one hand, we propose to maximize the channel-aware saliency by extracting the weight of neighboring nodes, which adapts to the shape and size variations in flexible objects. On the other hand, we maximize the spatial-aware saliency based on clustering to aggregate neighborhood information for the centroid nodes, which introduces local context information for representation learning. To thoroughly verify the performance of flexible object recognition, we propose, for the first time, the Flexible Dataset (FDA), which consists of various images of flexible objects collected from real-world scenarios or online. Extensive experiments on our Flexible Dataset demonstrate the effectiveness of our method in enhancing the discrimination of flexible objects.