CV · AI · LG · Sep 9, 2024

Real-Time Human Action Recognition on Embedded Platforms

arXiv:2409.05662v2 · 8 citations · h-index: 30
Originality: Highly original
AI Analysis

The paper addresses real-time performance challenges for human action recognition on resource-constrained embedded devices. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles real-time human action recognition on embedded platforms by identifying optical flow extraction as the latency bottleneck and proposing a novel motion feature extractor. The resulting system runs at 30 frames per second while maintaining high recognition accuracy.

With advancements in computer vision and deep learning, video-based human action recognition (HAR) has become practical. However, due to the complexity of the computation pipeline, running HAR on live video streams incurs excessive delays on embedded platforms. This work tackles the real-time performance challenges of HAR with four contributions: 1) an experimental study identifying a standard Optical Flow (OF) extraction technique as the latency bottleneck in a state-of-the-art HAR pipeline, 2) an exploration of the latency-accuracy tradeoff between the standard and deep learning approaches to OF extraction, which highlights the need for a novel, efficient motion feature extractor, 3) the design of Integrated Motion Feature Extractor (IMFE), a novel single-shot neural network architecture for motion feature extraction with drastic improvement in latency, 4) the development of RT-HARE, a real-time HAR system tailored for embedded platforms. Experimental results on an Nvidia Jetson Xavier NX platform demonstrated that RT-HARE realizes real-time HAR at a video frame rate of 30 frames per second while delivering high levels of recognition accuracy.
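To make the real-time constraint concrete: at 30 frames per second, each frame has a budget of roughly 33.3 ms. The sketch below shows one way a per-stage latency study of the kind the abstract describes could be set up. It is not the authors' code; the stage names, the helper functions (`time_stage`, `find_bottleneck`, `meets_real_time`), and the latency numbers are all hypothetical, and the check assumes the stages run sequentially on each frame.

```python
import time
from typing import Callable, Dict, Tuple

FPS = 30
FRAME_BUDGET_MS = 1000.0 / FPS  # about 33.3 ms per frame at 30 fps


def time_stage(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the mean wall-clock latency of fn in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats


def find_bottleneck(latencies_ms: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its share of the total per-frame latency."""
    total = sum(latencies_ms.values())
    name = max(latencies_ms, key=latencies_ms.get)
    return name, latencies_ms[name] / total


def meets_real_time(latencies_ms: Dict[str, float],
                    budget_ms: float = FRAME_BUDGET_MS) -> bool:
    """True if the stages, run one after another, fit in one frame period."""
    return sum(latencies_ms.values()) <= budget_ms


# Made-up latencies for illustration only; these are NOT measurements
# from the paper. Stage names are hypothetical.
example = {"decode": 4.0, "optical_flow": 120.0, "classifier": 12.0}

stage, share = find_bottleneck(example)
print(f"budget per frame: {FRAME_BUDGET_MS:.1f} ms")
print(f"bottleneck: {stage} ({share:.0%} of total)")
print(f"real-time at {FPS} fps: {meets_real_time(example)}")
```

In practice, `time_stage` would wrap each real pipeline stage (for example, a call into an optical flow routine) to fill in the latency table. If stages are pipelined across frames rather than run sequentially, the relevant check is the slowest single stage against the budget, not the sum.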

