Sai Deepesh Pokala

2papers

2 Papers

NEAug 8, 2024
A frugal Spiking Neural Network for unsupervised classification of continuous multivariate temporal data

Sai Deepesh Pokala, Marie Bernert, Takuya Nanami et al.

As neural interfaces become more advanced, there has been an increase in the volume and complexity of neural data recordings. These interfaces capture rich information about neural dynamics that call for efficient, real-time processing algorithms to spontaneously extract and interpret patterns of neural dynamics. Moreover, being able to do so in a fully unsupervised manner is critical as patterns in vast streams of neural data might not be easily identifiable by the human eye. Formal Deep Neural Networks (DNNs) have come a long way in performing pattern recognition tasks for various static and sequential pattern recognition applications. However, these networks usually require large labeled datasets for training and have high power consumption preventing their future embedding in active brain implants. An alternative aimed at addressing these issues are Spiking Neural Networks (SNNs) which are neuromorphic and use more biologically plausible neurons with evolving membrane potentials. In this context, we introduce here a frugal single-layer SNN designed for fully unsupervised identification and classification of multivariate temporal patterns in continuous data with a sequential approach. We show that, with only a handful number of neurons, this strategy is efficient to recognize highly overlapping multivariate temporal patterns, first on simulated data, and then on Mel Cepstral representations of speech sounds and finally on multichannel neural data. This approach relies on several biologically inspired plasticity rules, including Spike-timing-dependent plasticity (STDP), Short-term plasticity (STP) and intrinsic plasticity (IP). These results pave the way towards highly frugal SNNs for fully unsupervised and online-compatible learning of complex multivariate temporal patterns for future embedding in dedicated very-low power hardware.

10.5CVMay 22
Exploring deep learning for Event-Based Saliency Prediction with a Transformer-based model

Romaric Mazna, Jean Martinet, Sai Deepesh Pokala

Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention. In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras. Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the lack of a strong baseline. In this paper, we introduce SEST (Swin Event-based Saliency Transformer), a transformer-based model for saliency prediction from event data, bridging the data scarcity barrier through event-native pretraining and synthetic supervision. SEST leverages a self-supervised pretrained event-based Swin Transformer backbone combined with a lightweight CNN decoder to produce dynamic saliency maps. To address the scarcity of annotated event-based saliency data, we introduce two new benchmark datasets, N-DHF1K and N-UCF Sports, generated from large-scale RGB saliency benchmarks. Experimental results show that SEST clearly outperforms existing event-based saliency methods and narrows the performance gap with state-of-the-art RGB models. Zero-shot evaluation on a real event camera dataset further demonstrates that our model trained on synthetic data remains transferable on real event streams. To the best of our knowledge, this work is the first to apply deep learning to event-based saliency prediction, opening a new research direction at the intersection of event-based vision and neuromorphic visual attention.