Minchi Hu

h-index: 25
2 papers

4.3 | NE | Apr 22
Spatio-Temporal Cluster-Triggered Encoding for Spiking Neural Networks

Minchi Hu

Encoding static images into spike trains is a fundamental step for enabling Spiking Neural Networks (SNNs) to process visual information. However, widely used methods such as rate coding, Poisson encoding, and time-to-first-spike (TTFS) often neglect spatial correlations and produce temporally inconsistent spike patterns, limiting both efficiency and interpretability. In this work, we propose a cluster-based encoding framework that explicitly preserves semantic structure across both spatial and temporal domains. The method first introduces a 2D spatial clustering mechanism, which uses connected component analysis and local density estimation to identify salient foreground regions. Building on this, we extend the approach to a 3D spatio-temporal (ST3D) encoding scheme that incorporates temporal neighborhood information, generating spike trains with enhanced temporal coherence. Experiments on the N-MNIST dataset show that the proposed ST3D encoder achieves 98.17% classification accuracy using a simple single-layer SNN, outperforming conventional TTFS encoding (97.58%). Notably, this performance is achieved with 24% fewer spikes (3,800 vs. 5,000 per sample), indicating improved efficiency without sacrificing accuracy. These results suggest that the proposed method provides an interpretable, structure-aware, and computationally efficient encoding strategy with strong potential for neuromorphic computing applications.
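
To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of TTFS encoding followed by a connected-component filter that drops isolated pixels. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`ttfs_encode`, `label_components`, `cluster_filtered_ttfs`), the linear intensity-to-time mapping, the 4-connectivity, and the `threshold` and `min_size` parameters are all hypothetical choices, and the local density estimation and ST3D temporal extension are omitted.

```python
from collections import deque

import numpy as np


def ttfs_encode(image, num_steps=10, threshold=0.0):
    """Time-to-first-spike: brighter pixels spike earlier.

    Returns an integer array of spike times in [0, num_steps - 1];
    pixels at or below `threshold` (after normalisation) get -1 (no spike).
    The linear mapping is an assumption made for illustration.
    """
    img = np.asarray(image, dtype=float)
    times = np.full(img.shape, -1, dtype=int)
    peak = img.max()
    if peak <= 0:
        return times
    norm = img / peak
    active = norm > threshold
    times[active] = np.round((1.0 - norm[active]) * (num_steps - 1)).astype(int)
    return times


def label_components(mask):
    """Label 4-connected regions of a boolean mask with a breadth-first search."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    rows, cols = mask.shape
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or labels[r, c]:
                continue
            count += 1
            labels[r, c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = count
                        queue.append((ny, nx))
    return labels, count


def cluster_filtered_ttfs(image, num_steps=10, threshold=0.2, min_size=2):
    """Keep TTFS spikes only for pixels in components of at least `min_size` pixels."""
    times = ttfs_encode(image, num_steps, threshold)
    labels, count = label_components(times >= 0)
    sizes = np.bincount(labels.ravel(), minlength=count + 1)
    keep = (labels > 0) & (sizes[labels] >= min_size)
    times[~keep] = -1
    return times


if __name__ == "__main__":
    img = np.array([[0.0, 0.0, 0.0, 0.9],
                    [1.0, 0.5, 0.0, 0.0],
                    [0.5, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.0]])
    plain = ttfs_encode(img, num_steps=11, threshold=0.2)
    filtered = cluster_filtered_ttfs(img, num_steps=11, threshold=0.2, min_size=2)
    print("spikes without filter:", int((plain >= 0).sum()))
    print("spikes with filter:", int((filtered >= 0).sum()))
```

In this toy input the lone bright pixel in the top-right corner forms a one-pixel component and is discarded, so the filtered encoding emits fewer spikes than plain TTFS. That mirrors the direction of the spike-count reduction reported in the abstract, but it says nothing about the reported accuracy figures.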

CL | Jul 24, 2025
Deep Learning Approaches for Multimodal Intent Recognition: A Survey

Jingwei Zhao, Yuhua Wen, Qifei Li et al.

Intent recognition aims to identify users' underlying intentions, traditionally focusing on text in natural language processing. With growing demands for natural human-computer interaction, the field has evolved through deep learning and multimodal approaches, incorporating data from audio, vision, and physiological signals. Recently, the introduction of Transformer-based models has led to notable breakthroughs in this domain. This article surveys deep learning methods for intent recognition, covering the shift from unimodal to multimodal techniques, relevant datasets, methodologies, applications, and current challenges. It provides researchers with insights into the latest developments in multimodal intent recognition (MIR) and directions for future research.