LG · AI · Apr 17

Closing the Theory-Practice Gap in Spiking Transformers via Effective Dimension

arXiv:2604.15769 · 34.2 · h-index: 2
Predicted impact top 69% in LG · last 90 days · Originality: Highly original
AI Analysis

Provides the first theoretical foundation for designing spiking transformers, addressing a critical gap for neuromorphic computing researchers.

This paper establishes the first comprehensive expressivity theory for spiking self-attention, proving that spiking transformers are universal approximators and deriving tight spike-count lower bounds. The theory explains why T=4 timesteps suffice in practice despite worst-case predictions of T≥10,000, and its predictions are validated across multiple architectures with R²=0.97.

Spiking transformers achieve competitive accuracy with conventional transformers while offering $38$--$57\times$ energy efficiency on neuromorphic hardware, yet no theoretical framework guides their design. This paper establishes the first comprehensive expressivity theory for spiking self-attention. We prove that spiking attention with Leaky Integrate-and-Fire neurons is a universal approximator of continuous permutation-equivariant functions, providing explicit spike circuit constructions including a novel lateral inhibition network for softmax normalization with proven $O(1/\sqrt{T})$ convergence. We derive tight spike-count lower bounds via rate-distortion theory: $\varepsilon$-approximation requires $\Omega(L_f^2 nd/\varepsilon^2)$ spikes, with a rigorous information-theoretic derivation. Our key insight is input-dependent bounds using measured effective dimensions ($d_{\text{eff}}=47$--$89$ for CIFAR/ImageNet), explaining why $T=4$ timesteps suffice despite worst-case $T \geq 10{,}000$ predictions. We provide concrete design rules with calibrated constants ($C=2.3$, 95\% CI: $[1.9, 2.7]$). Experiments on Spikformer, QKFormer, and SpikingResformer across vision and language benchmarks validate predictions with $R^2=0.97$ ($p<0.001$). Our framework provides the first principled foundation for neuromorphic transformer design.
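To make the scaling in the lower bound concrete, the following is a minimal Python sketch that evaluates only the term inside the $\Omega(\cdot)$, namely $L_f^2 nd/\varepsilon^2$. It rests on assumptions that the abstract does not confirm: that the input-dependent bound is obtained by substituting $d_{\text{eff}}$ for $d$, and that the values chosen for $L_f$, $n$, $\varepsilon$, and the nominal width $d$ are reasonable placeholders. The function name `spike_bound_scale` is hypothetical, and the hidden constant of the bound is omitted, so only ratios between calls are meaningful.

```python
def spike_bound_scale(lipschitz, n_tokens, dim, eps):
    """Return L_f^2 * n * d / eps^2, the term inside the Omega(.) bound.

    The hidden constant of the Omega is NOT included, so only ratios
    between two calls are meaningful, not the absolute values.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return (lipschitz ** 2) * n_tokens * dim / (eps ** 2)


# Hypothetical settings chosen only for illustration (not from the paper).
L_F = 1.0        # Lipschitz constant L_f of the target function (assumed)
N_TOKENS = 64    # sequence length n (assumed)
EPS = 0.1        # target approximation error epsilon (assumed)
D_NOMINAL = 384  # nominal embedding width d (assumed)
D_EFF = 47       # lower end of the d_eff range quoted in the abstract

# Assumption: the input-dependent bound replaces d with d_eff.
worst_case = spike_bound_scale(L_F, N_TOKENS, D_NOMINAL, EPS)
input_dependent = spike_bound_scale(L_F, N_TOKENS, D_EFF, EPS)
reduction = worst_case / input_dependent

# Halving eps should quadruple the term, since it scales as 1 / eps^2.
eps_ratio = (spike_bound_scale(L_F, N_TOKENS, D_EFF, EPS / 2)
             / spike_bound_scale(L_F, N_TOKENS, D_EFF, EPS))

print(f"reduction from using d_eff: {reduction:.2f}x")
print(f"cost of halving eps: {eps_ratio:.2f}x")
```

Under these placeholder values the reduction factor is simply $d/d_{\text{eff}}$, so the sketch shows the direction of the effect rather than reproducing the paper's $T=4$ versus $T \geq 10{,}000$ comparison, which depends on constants and details not given in the abstract.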

