A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions
This work addresses the energy efficiency of neural networks. It builds on existing SNN concepts and contributes a new training approach rather than a purely incremental refinement.
The authors tackle the challenge of maintaining state-of-the-art (SotA) accuracy when converting non-spiking networks into spiking neural networks (SNNs) for energy efficiency. The result is a hybrid SNN that outperforms SotA recurrent architectures, including the LSTM, GRU, and NRU, in accuracy while reducing activities to at most 3.74 bits on average, with 1.26 significant bits multiplying each weight.
The machine learning community has become increasingly interested in the energy efficiency of neural networks. The Spiking Neural Network (SNN) is a promising approach to energy-efficient computing, since its activation levels are quantized into temporally sparse, one-bit values (i.e., "spike" events), which additionally converts the sum over weight-activity products into a simple addition of weights (one weight for each spike). However, the goal of maintaining state-of-the-art (SotA) accuracy when converting a non-spiking network into an SNN has remained an elusive challenge, primarily due to spikes having only a single bit of precision. Adopting tools from signal processing, we cast neural activation functions as quantizers with temporally-diffused error, and then train networks while smoothly interpolating between the non-spiking and spiking regimes. We apply this technique to the Legendre Memory Unit (LMU) to obtain the first known example of a hybrid SNN outperforming SotA recurrent architectures -- including the LSTM, GRU, and NRU -- in accuracy, while reducing activities to at most 3.74 bits on average with 1.26 significant bits multiplying each weight. We discuss how these methods can significantly improve the energy efficiency of neural networks.
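To make the idea of an activation function as a quantizer with temporally-diffused error more concrete, the following is a minimal sketch and not the authors' implementation. It assumes non-negative activations and a hypothetical resolution parameter `omega`: each step, the activation is scaled by `omega`, rounded down to a whole number of spikes, and the rounding error is carried into the next step. With `omega = 1` and activations in [0, 1] the output is at most one spike per step (the spiking regime), and larger `omega` approaches the unquantized activation. The function names `quantize_with_error_diffusion` and `spike_driven_sum` are illustrative choices, not names from the paper.

```python
import math


def quantize_with_error_diffusion(activations, omega):
    """Quantize a time series of non-negative activations.

    `omega` (assumed name) sets the number of quantization levels per unit
    of activation. The rounding error of each step is added to the next
    step, so the error is diffused over time instead of being discarded.
    """
    residual = 0.0  # rounding error carried over from earlier steps
    outputs = []
    for a in activations:
        total = omega * a + residual
        count = math.floor(total)      # whole number of spikes this step
        residual = total - count       # error pushed to the next step
        outputs.append(count / omega)  # rescale back to activation units
    return outputs


def spike_driven_sum(weights, spikes):
    """Weighted sum for one-bit activities: add one weight per spike.

    For spikes in {0, 1} this equals the dot product of weights and
    spikes, but it needs no multiplications.
    """
    total = 0.0
    for w, s in zip(weights, spikes):
        if s:
            total += w
    return total


# A constant activation of 0.25 over eight time steps.
activations = [0.25] * 8

# omega = 1: one-bit output, one spike on every fourth step.
print(quantize_with_error_diffusion(activations, omega=1))

# omega = 4: enough levels to represent 0.25 exactly at every step.
print(quantize_with_error_diffusion(activations, omega=4))

# Three inputs, of which the first and third spiked.
print(spike_driven_sum([0.5, -1.0, 2.0], [1, 0, 1]))
```

In this sketch the running total of the quantized output stays within `1 / omega` of the running total of the original activations, which is the sense in which the quantization error is spread over time rather than accumulated.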