An Event-Based Digital Compute-In-Memory Accelerator with Flexible Operand Resolution and Layer-Wise Weight/Output Stationarity
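This work addresses the limited flexibility of SNN accelerators for edge vision applications, reporting a 2× gain in bit-normalized energy efficiency over prior fixed-precision digital CIM-SNNs and a classification accuracy of 95.8% on the IBM DVS gesture dataset.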
The paper tackles the lack of flexibility in compute-in-memory accelerators for spiking neural networks by proposing a digital CIM macro that supports arbitrary operand resolution and shape, achieving a 2× increase in bit-normalized energy efficiency over prior fixed-precision digital CIM-SNNs and up to 90% energy savings in large-scale systems.
Compute-in-memory (CIM) accelerators for spiking neural networks (SNNs) are promising solutions to enable $\mu$s-level inference latency and ultra-low energy in edge vision applications. Yet, their current lack of flexibility at both the circuit and system levels prevents their deployment in a wide range of real-life scenarios. In this work, we propose a novel digital CIM macro that supports arbitrary operand resolution and shape, with a unified CIM storage for weights and membrane potentials. These circuit-level techniques enable a hybrid weight- and output-stationary dataflow at the system level that maximizes operand reuse, thereby minimizing costly on- and off-chip data movements during SNN execution. Measurement results of a fabricated FlexSpIM prototype in 40-nm CMOS demonstrate a 2$\times$ increase in bit-normalized energy efficiency compared to prior fixed-precision digital CIM-SNNs, while providing resolution reconfiguration with bitwise granularity. Our approach can save up to 90% energy in large-scale systems, while reaching a state-of-the-art classification accuracy of 95.8% on the IBM DVS gesture dataset.
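To make the idea of layer-wise stationarity concrete, the following is a minimal sketch of how a per-layer dataflow choice could be reasoned about. It is not the FlexSpIM scheduling method: the cost model (bits moved per timestep), the `Layer` fields, the function names, and the example layer shapes and resolutions are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical layer description (all fields are illustrative)."""
    name: str
    n_weights: int   # number of distinct weights in the layer
    n_neurons: int   # number of output neurons (membrane potentials)
    w_bits: int      # weight resolution in bits
    v_bits: int      # membrane-potential resolution in bits


def moved_bits_per_timestep(layer: Layer, dataflow: str) -> int:
    """Assumed cost model: bits moved in or out of the CIM per timestep."""
    weight_bits = layer.n_weights * layer.w_bits
    potential_bits = layer.n_neurons * layer.v_bits
    if dataflow == "weight_stationary":
        # Weights stay in the CIM; potentials are read and written back.
        return 2 * potential_bits
    if dataflow == "output_stationary":
        # Potentials stay in the CIM; weights are streamed in.
        return weight_bits
    raise ValueError(f"unknown dataflow: {dataflow}")


def choose_dataflow(layer: Layer) -> str:
    """Pick the dataflow that moves fewer bits under the assumed model."""
    options = ("weight_stationary", "output_stationary")
    return min(options, key=lambda d: moved_bits_per_timestep(layer, d))


# Made-up example layers: a convolution with few shared weights and many
# neurons, and a fully connected classifier with many weights and few neurons.
conv = Layer("conv", n_weights=3 * 3 * 16 * 32, n_neurons=32 * 32 * 32,
             w_bits=4, v_bits=8)
fc = Layer("fc", n_weights=512 * 11, n_neurons=11, w_bits=6, v_bits=10)

for layer in (conv, fc):
    print(layer.name, choose_dataflow(layer))
```

Under this assumed model, the convolutional layer keeps its large set of membrane potentials stationary, while the fully connected layer keeps its large weight matrix stationary. Because resolution enters the cost as a per-layer multiplier, changing `w_bits` or `v_bits` can shift the preferred dataflow, which is one way to see why flexible operand resolution and flexible stationarity interact.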