WaveSense: Efficient Temporal Convolutions with Spiking Neural Networks for Keyword Spotting
This work addresses the need for efficient signal processing in always-on edge devices and offers a neuromorphic-friendly solution with competitive accuracy. Its contribution is incremental, however, since it builds on existing spiking-network and WaveNet architectures.
The authors address ultra-low-power keyword spotting on edge devices by proposing WaveSense, a spiking neural network inspired by WaveNet. It outperforms other spiking networks and comes close to the state-of-the-art performance of artificial neural networks such as CNNs and LSTMs.
Ultra-low-power local signal processing is crucial for edge applications on always-on devices. Neuromorphic processors that emulate spiking neural networks offer substantial computational power within the limited power budget this domain requires. In this work we propose spiking neural dynamics as a natural alternative to dilated temporal convolutions. We extend this idea to WaveSense, a spiking neural network inspired by the WaveNet architecture. WaveSense uses simple neural dynamics, fixed time constants, and a simple feed-forward architecture, and is therefore particularly well suited to neuromorphic implementation. We test the capabilities of this model on several keyword-spotting datasets. The results show that the proposed network outperforms previous state-of-the-art spiking neural networks and comes close to the state-of-the-art performance of artificial neural networks such as CNNs and LSTMs.
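To make the analogy between dilated temporal convolutions and spiking dynamics concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a causal two-tap dilated convolution on one side and, on the other, an exponentially decaying synapse with a fixed time constant feeding a leaky integrate-and-fire neuron with subtractive reset. All function names, parameters, and the reset rule are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def dilated_conv_two_tap(x, w_now, w_past, dilation):
    """Causal two-tap dilated convolution (illustrative):
    y[t] = w_now * x[t] + w_past * x[t - dilation], with zero padding."""
    return [
        w_now * x[t] + (w_past * x[t - dilation] if t >= dilation else 0.0)
        for t in range(len(x))
    ]


def leaky_synapse(spikes, tau, dt=1.0):
    """Exponentially decaying synaptic current with fixed time constant tau.
    A larger tau keeps a trace of older inputs, loosely playing the role
    that a larger dilation plays in a temporal convolution (assumption)."""
    decay = math.exp(-dt / tau)
    current, trace = 0.0, []
    for s in spikes:
        current = current * decay + s
        trace.append(current)
    return trace


def lif_neuron(input_current, tau_mem, threshold=1.0, dt=1.0):
    """Leaky integrate-and-fire neuron with subtractive reset (assumed)."""
    decay = math.exp(-dt / tau_mem)
    v, out = 0.0, []
    for i in input_current:
        v = v * decay + i
        if v >= threshold:
            out.append(1)
            v -= threshold  # subtract the threshold instead of resetting to zero
        else:
            out.append(0)
    return out


if __name__ == "__main__":
    impulse = [1, 0, 0, 0, 0, 0, 0, 0]
    print("dilated conv:", dilated_conv_two_tap(impulse, 1.0, 0.5, dilation=2))
    print("fast synapse:", [round(c, 3) for c in leaky_synapse(impulse, tau=1.0)])
    print("slow synapse:", [round(c, 3) for c in leaky_synapse(impulse, tau=4.0)])
    print("LIF output  :", lif_neuron([0.6, 0.6, 0.0, 0.0], tau_mem=10.0))
```

In this sketch the dilated convolution reaches back to one specific past sample, whereas the synapse with a longer time constant retains a smoothly decaying memory of past spikes. Combining synapses with different fixed time constants is one way a feed-forward spiking layer could cover several temporal scales without explicit delay buffers.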