AI · Mar 17

NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics

arXiv:2603.16148 · 3.2 · h-index: 2

AI Analysis

This work addresses the challenge of building efficient, biologically plausible language models. It is incremental, however, since it builds on existing SNN and state-space methods.

The paper addresses training a pure spiking neural network (SNN) for large-scale language modeling without Transformer distillation. NeuronSpark-0.9B reached a pretraining loss of 3.6 and showed early multi-turn dialogue behavior after supervised fine-tuning.

We ask whether a pure spiking backbone can learn large-scale language modeling from random initialization, without Transformer distillation. We introduce NeuronSpark, a 0.9B-parameter SNN language model trained with next-token prediction and surrogate gradients. The model combines selective state-space spiking dynamics, leakage-current inter-layer communication, PonderNet adaptive timesteps, fused Triton PLIF kernels, and stabilization techniques (residual centering, lateral-inhibition normalization, and natural-gradient compensation). Under a constrained budget (about 1.4B pretraining tokens and 6.5K SFT steps), NeuronSpark-0.9B reaches 3.6 pretraining loss and shows early multi-turn dialogue behavior after SFT. These results support the feasibility of end-to-end language modeling with a pure SNN architecture at this scale.
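For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of a single parametric leaky integrate-and-fire (PLIF) neuron and a sigmoid-shaped surrogate gradient. It is not the paper's fused Triton kernel and makes no claim about NeuronSpark's actual update rule: it assumes the commonly used PLIF formulation with a learnable leak factor `sigmoid(w)` and a hard reset. The names `plif_forward` and `surrogate_grad`, and the parameters `w`, `v_threshold`, `v_reset`, and `alpha`, are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def plif_forward(inputs, w=0.0, v_threshold=1.0, v_reset=0.0):
    """Run one PLIF neuron over a sequence of input currents.

    Assumed update (standard PLIF form, not taken from the paper):
        v <- v + k * (x - (v - v_reset)),  k = sigmoid(w)
    followed by a spike and hard reset when v >= v_threshold.
    """
    k = sigmoid(w)  # learnable leak factor, always in (0, 1)
    v = v_reset
    spikes = []
    for x in inputs:
        v = v + k * (x - (v - v_reset))  # leaky integration
        spike = 1.0 if v >= v_threshold else 0.0
        spikes.append(spike)
        if spike:
            v = v_reset  # hard reset after firing
    return spikes


def surrogate_grad(v, v_threshold=1.0, alpha=4.0):
    """Smooth stand-in for the derivative of the spike step function.

    The true derivative is zero almost everywhere, so training uses the
    derivative of sigmoid(alpha * (v - v_threshold)) in the backward pass.
    """
    s = sigmoid(alpha * (v - v_threshold))
    return alpha * s * (1.0 - s)


if __name__ == "__main__":
    print(plif_forward([1.5, 1.5, 1.5, 1.5]))
    print(surrogate_grad(1.0))
```

With `w=0.0` the leak factor is 0.5, so a constant input of 1.5 needs two steps to cross the threshold before each spike. In a real training setup the forward pass would emit the hard spike while the backward pass substitutes `surrogate_grad`, which is what lets next-token prediction loss propagate through non-differentiable spikes.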
