Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation
This work provides a resource-efficient solution for gesture recognition in automotive settings that is insensitive to lighting and low in cost, though its application of SNNs to this domain is incremental.
The paper tackles hand gesture recognition using a low-resolution thermal camera and a novel Spiking Neural Network (SNN) with Sparse Segmentation, achieving 93.9% accuracy on a 5-class dataset in an automotive context while reducing memory and compute complexity by over an order of magnitude compared to deep learning.
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 × 32) thermal sensor processed by a Spiking Neural Network (SNN), followed by Sparse Segmentation and feature-based gesture classification via Robust Principal Component Analysis (R-PCA). Unlike systems based on standard RGB cameras, the proposed system is insensitive to lighting variations, and it is significantly less expensive than the high-frequency radars, time-of-flight cameras and high-resolution thermal sensors previously used in the literature. Crucially, this paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNN achieves more than one order of magnitude smaller memory and compute complexity than deep learning approaches, while reaching a top gesture recognition accuracy of 93.9% on a 5-class thermal camera dataset acquired in a car cabin. Our dataset is released to support future research.
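To give a concrete feel for the R-PCA step named above, the following is a minimal sketch of a generic Robust PCA decomposition (principal component pursuit solved with a simple augmented Lagrangian iteration) applied to a stack of flattened 24 × 32 frames. It is not the authors' pipeline: the function names `robust_pca`, `_svd_threshold` and `_soft_threshold`, the default values of `lam` and `mu`, and the idea of stacking frames as matrix columns are all assumptions made for illustration, and the SNN and Sparse Segmentation stages are not modeled here.

```python
import numpy as np


def _soft_threshold(x, tau):
    """Element-wise shrinkage: pulls every entry towards zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def _svd_threshold(x, tau):
    """Singular value thresholding: shrinks the singular values of x by tau."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (u * s) @ vt


def robust_pca(m, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split m into a low-rank part and a sparse part, m ~ low_rank + sparse.

    Hypothetical helper for illustration only. The defaults for lam and mu
    are commonly used heuristics, not values taken from the paper.
    """
    m = np.asarray(m, dtype=float)
    rows, cols = m.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(rows, cols))
    if mu is None:
        mu = rows * cols / (4.0 * np.abs(m).sum())

    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m)

    for _ in range(max_iter):
        # Update the low-rank part with the sparse part held fixed.
        low_rank = _svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        # Update the sparse part with the low-rank part held fixed.
        sparse = _soft_threshold(m - low_rank + dual / mu, lam / mu)
        # Dual ascent on the constraint m = low_rank + sparse.
        residual = m - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break

    return low_rank, sparse
```

Under these assumptions, a sequence of `T` thermal frames would be flattened into a matrix of shape `(24 * 32, T)` and passed to `robust_pca`; the low-rank part then captures the slowly varying background, while the sparse part captures localized deviations such as a warm hand. How the paper actually derives gesture features from R-PCA is not specified in the abstract.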