CV · Apr 26, 2021

Learning from Event Cameras with Sparse Spiking Convolutional Neural Networks

arXiv:2104.12579v1 · 45 citations
Originality: Incremental advance
AI Analysis

The paper addresses energy efficiency for embedded vision systems, such as those in cars. The advance is incremental because it builds on existing SNN and event camera methods.

The paper tackles the high power consumption of convolutional neural networks (CNNs) on embedded systems by proposing an end-to-end biologically inspired approach that pairs event cameras with spiking neural networks (SNNs). It reports accuracy, sparsity, and training time on the DVS128 Gesture Dataset that the authors consider sufficient for future real-time applications on low-power neuromorphic hardware.

Convolutional neural networks (CNNs) are now the de facto solution for computer vision problems thanks to their impressive results and ease of learning. These networks are composed of layers of connected units called artificial neurons, loosely modeling the neurons in a biological brain. However, their implementation on conventional hardware (CPU/GPU) results in high power consumption, making their integration on embedded systems difficult. In a car, for example, embedded algorithms face very tight constraints in terms of energy, latency and accuracy. To design more efficient computer vision algorithms, we propose to follow an end-to-end biologically inspired approach using event cameras and spiking neural networks (SNNs). Event cameras output asynchronous and sparse events, providing an incredibly efficient data source, but processing these events with synchronous and dense algorithms such as CNNs does not yield any significant benefits. To address this limitation, we use SNNs, which are more biologically realistic neural networks where units communicate using discrete spikes. Due to the nature of their operations, they are hardware friendly and energy-efficient, but training them remains a challenge. Our method enables the training of sparse spiking convolutional neural networks directly on event data, using the popular deep learning framework PyTorch. The performance in terms of accuracy, sparsity and training time on the popular DVS128 Gesture Dataset makes it possible to use this bio-inspired approach for the future embedding of real-time applications on low-power neuromorphic hardware.
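To make the abstract's two key ideas concrete (sparse events as input, and units that communicate with discrete spikes), here is a minimal sketch in plain Python. It is not the authors' method or code. The event tuple layout, the `bin_events` and `run_neuron` helpers, and the `LIFNeuron` parameters are all assumptions chosen for illustration. The neuron is a generic leaky integrate-and-fire model, and the training procedure the paper describes is not covered.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed event layout (timestamp_us, x, y, polarity); not taken from the paper.
Event = Tuple[int, int, int, int]
Frame = List[List[int]]


def bin_events(events: List[Event], width: int, height: int,
               bin_us: int, n_bins: int) -> List[Frame]:
    """Accumulate events into n_bins binary frames indexed as [y][x]."""
    frames = [[[0] * width for _ in range(height)] for _ in range(n_bins)]
    for t, x, y, _polarity in events:
        b = t // bin_us
        if 0 <= b < n_bins:
            frames[b][y][x] = 1  # a pixel either fired in this bin or it did not
    return frames


@dataclass
class LIFNeuron:
    """Generic leaky integrate-and-fire unit with a hard reset (illustrative values)."""
    decay: float = 0.5      # fraction of membrane potential kept per time step
    threshold: float = 1.0  # potential at which the neuron emits a spike
    potential: float = 0.0

    def step(self, input_current: float) -> int:
        # Leak the previous potential, then integrate the new input.
        self.potential = self.decay * self.potential + input_current
        if self.potential >= self.threshold:
            self.potential = 0.0  # reset after emitting a spike
            return 1
        return 0


def run_neuron(frames: List[Frame], weight: float) -> List[int]:
    """Drive one neuron with the weighted count of active pixels in each frame."""
    neuron = LIFNeuron()
    return [neuron.step(weight * sum(map(sum, frame))) for frame in frames]


def spike_rate(spikes: List[int]) -> float:
    """Fraction of time steps with a spike; lower means sparser activity."""
    return sum(spikes) / len(spikes)


# Four made-up events on a 2x2 sensor, binned into 1 ms steps.
events = [(0, 0, 0, 1), (100, 1, 0, 1), (1500, 1, 1, 0), (3200, 0, 1, 1)]
frames = bin_events(events, width=2, height=2, bin_us=1000, n_bins=4)
spikes = run_neuron(frames, weight=0.6)
print(spikes, spike_rate(spikes))
```

In this toy setting the neuron stays silent unless enough events arrive close together, which is the sense in which spike-based processing can keep downstream activity sparse. A real spiking convolutional layer would apply learned kernels per pixel neighborhood rather than a single scalar weight.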

Code Implementations: 1 repo
