NE, CV, LG, IV | Feb 5, 2025

Spiking Neural Network Feature Discrimination Boosts Modality Fusion

arXiv:2502.10423v1 | 1 citation | h-index: 12 | IEEE Trans Cogn Dev Syst
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient and effective feature representation for multi-modal tasks, though it appears incremental, as it applies SNNs to a known bottleneck.

The paper tackles the problem of feature discrimination in multi-modal learning by proposing a spiking neural network (SNN) approach for audio-visual data, achieving competitive classification results compared to existing works.

Feature discrimination is a crucial aspect of neural network design, as it directly impacts a network's ability to distinguish between classes and generalize across diverse datasets. Achieving high-quality feature representations that ensure high inter-class separability remains one of the most challenging research directions. While conventional deep neural networks (DNNs) rely on complex transformations and very deep architectures to produce meaningful feature representations, they usually require days of training and consume significant amounts of energy. Spiking neural networks (SNNs) offer a promising alternative. Their ability to capture temporal and spatial dependencies makes them particularly suitable for complex tasks that require multi-modal data. In this paper, we propose a feature discrimination approach for multi-modal learning with SNNs, focusing on audio-visual data. We employ deep spiking residual learning for visual modality processing and a simpler yet efficient spiking network for auditory modality processing. Lastly, we deploy a spiking multilayer perceptron for modality fusion. We present our findings and evaluate our approach against similar works on classification tasks. To the best of our knowledge, this is the first work investigating feature discrimination in SNNs.
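To make the fusion stage in the abstract more concrete, here is a minimal NumPy sketch of a two-layer spiking multilayer perceptron applied to fused audio-visual spike features. It is not the authors' implementation. The leaky integrate-and-fire neuron with a hard reset, the fusion by concatenation, the spike-count readout, and all names (`lif_layer`, `fuse_and_classify`, `beta`, `threshold`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def lif_layer(inputs, weights, beta=0.9, threshold=1.0):
    """Run one layer of leaky integrate-and-fire neurons over time.

    inputs:  (T, n_in) array of input spikes or currents
    weights: (n_in, n_out) synaptic weights
    Returns a (T, n_out) array of binary output spikes.
    Assumption: hard reset to zero after a spike.
    """
    n_steps = inputs.shape[0]
    n_out = weights.shape[1]
    mem = np.zeros(n_out)                  # membrane potentials
    spikes = np.zeros((n_steps, n_out))
    for t in range(n_steps):
        mem = beta * mem + inputs[t] @ weights   # leak, then integrate
        fired = mem >= threshold
        spikes[t] = fired
        mem = np.where(fired, 0.0, mem)          # reset neurons that fired
    return spikes


def fuse_and_classify(visual_spikes, audio_spikes, w_hidden, w_out):
    """Concatenate per-timestep features from both modalities (assumed
    fusion scheme), pass them through two spiking layers, and pick the
    class whose output neuron fired most often."""
    fused = np.concatenate([visual_spikes, audio_spikes], axis=1)
    hidden = lif_layer(fused, w_hidden)
    out = lif_layer(hidden, w_out)
    counts = out.sum(axis=0)               # spike count per class
    return int(np.argmax(counts)), counts


# Toy usage with random spike trains standing in for encoder outputs.
rng = np.random.default_rng(0)
T, n_vis, n_aud, n_hid, n_cls = 20, 16, 8, 32, 10
visual = (rng.random((T, n_vis)) < 0.3).astype(float)
audio = (rng.random((T, n_aud)) < 0.3).astype(float)
w_hidden = rng.normal(0.0, 0.3, size=(n_vis + n_aud, n_hid))
w_out = rng.normal(0.0, 0.3, size=(n_hid, n_cls))
pred, counts = fuse_and_classify(visual, audio, w_hidden, w_out)
print("predicted class:", pred)
```

In a full pipeline, `visual` and `audio` would be replaced by the outputs of the spiking residual network and the auditory spiking network, and the weights would be learned, for example with surrogate-gradient training.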

