Efficient Synaptic Delay Implementation in Digital Event-Driven AI Accelerators
This work addresses hardware efficiency for neuromorphic computing, offering incremental improvements in memory scaling and energy usage for AI accelerators.
The paper tackles the problem of implementing synaptic delays in digital neuromorphic accelerators. It introduces the Shared Circular Delay Queue (SCDQ), a novel hardware structure whose memory scales better than that of commonly used approaches, and it also reports latency, area, and energy per inference.
Synaptic delay parameterization of neural network models has remained largely unexplored, but recent literature has shown promising results, suggesting that delay-parameterized models are simpler, smaller, sparser, and thus more energy efficient than non-delay-parameterized models of similar performance (e.g. task accuracy). We introduce the Shared Circular Delay Queue (SCDQ), a novel hardware structure for supporting synaptic delays on digital neuromorphic accelerators. Our analysis and hardware results show that it scales better in terms of memory than commonly used approaches and is more amenable to algorithm-hardware co-optimizations: its memory scaling is modulated by model sparsity and not merely by network size. In addition to memory, we also report latency, area, and energy per inference.
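To make the idea of a shared delay queue concrete, the following is a minimal software sketch of a generic circular delay buffer. It is not the SCDQ design itself, which the abstract does not specify. The class and field names (CircularDelayQueue, DelayedEvent, neuron_id, remaining) are hypothetical, and the sketch assumes that delays are whole timesteps of at least 1 and that undelivered events recirculate with a decremented counter. What it illustrates is only the scaling argument: occupancy depends on the number of in-flight events, so a sparser model needs a smaller buffer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DelayedEvent:
    neuron_id: int  # hypothetical field: source neuron that emitted the event
    remaining: int  # timesteps left before the event is delivered


class CircularDelayQueue:
    """Toy shared delay queue (illustrative assumption, not the paper's SCDQ).

    One bounded buffer holds every in-flight event, so memory use follows
    activity (sparsity) rather than the number of synapses.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.events: deque[DelayedEvent] = deque()

    def push(self, neuron_id: int, delay: int) -> None:
        """Enqueue an event to be delivered `delay` timesteps from now."""
        if len(self.events) >= self.capacity:
            raise OverflowError("delay queue full")
        self.events.append(DelayedEvent(neuron_id, delay))

    def tick(self) -> list[int]:
        """Advance one timestep and return the neuron ids that are due."""
        due = []
        # Visit each stored event exactly once per timestep.
        for _ in range(len(self.events)):
            ev = self.events.popleft()
            ev.remaining -= 1
            if ev.remaining <= 0:
                due.append(ev.neuron_id)
            else:
                self.events.append(ev)  # recirculate until the delay expires
        return due


queue = CircularDelayQueue(capacity=8)
queue.push(neuron_id=3, delay=2)
queue.push(neuron_id=5, delay=1)
first = queue.tick()   # event from neuron 5 is due
second = queue.tick()  # event from neuron 3 is due
third = queue.tick()   # nothing left in flight
```

In this sketch the capacity bounds the worst-case number of simultaneously delayed events, which is the quantity a sparser model would reduce.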