CV | AI | Jul 22, 2024

StreamTinyNet: video streaming analysis with spatial-temporal TinyML

Hazem Hesham Yousef Shalby, Massimo Pavan, Manuel Roveri

arXiv:2407.17524v13.73 citations | h-index: 6

Originality: Highly original

AI Analysis

StreamTinyNet enables spatial-temporal video analysis on resource-constrained TinyML devices, a task that was previously out of reach for such embedded systems.

The paper addresses video streaming analysis on TinyML devices, where existing solutions analyze frames one at a time. It introduces StreamTinyNet, described as the first architecture for multiple-frame spatial-temporal analysis, and demonstrates its effectiveness on public datasets and its feasibility on an Arduino Nicla Vision device.

Tiny Machine Learning (TinyML) is a branch of Machine Learning (ML) that bridges the ML world and the embedded system ecosystem (i.e., Internet of Things devices, embedded devices, and edge computing units), enabling the execution of ML algorithms on devices constrained in memory, computational capability, and power consumption. Video Streaming Analysis (VSA), one of the most interesting TinyML tasks, consists of scanning a sequence of frames in a streaming manner with the goal of identifying patterns of interest. Given the strict constraints of these tiny devices, all current solutions perform frame-by-frame analysis and therefore do not exploit the temporal component of the data stream. In this paper, we present StreamTinyNet, the first TinyML architecture to perform multiple-frame VSA, enabling a variety of use cases that require spatial-temporal analysis and that were previously impossible to carry out at the TinyML level. Experimental results on publicly available datasets show the effectiveness and efficiency of the proposed solution. Finally, StreamTinyNet has been ported to and tested on the Arduino Nicla Vision, showing the feasibility of the proposed approach.
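
The difference between frame-by-frame and multiple-frame analysis can be illustrated with a small sketch. It is not the StreamTinyNet architecture, which the abstract does not detail. It is a generic sliding-window pattern, and the names `mean_feature`, `brightness_trend`, and `stream_analyze` are hypothetical placeholders introduced only for illustration. Each incoming frame is reduced once to a small feature, the last few features are kept in a fixed-size buffer, and a temporal function is applied to that buffer.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Sequence

Frame = Sequence[float]   # assumption: a frame flattened to a list of pixel values
Feature = List[float]


def mean_feature(frame: Frame) -> Feature:
    """Hypothetical stand-in for a per-frame (spatial) feature extractor."""
    return [sum(frame) / len(frame)]


def brightness_trend(window: Sequence[Feature]) -> float:
    """Hypothetical stand-in for a temporal model: newest minus oldest feature."""
    return window[-1][0] - window[0][0]


def stream_analyze(
    frames: Iterable[Frame],
    window_size: int,
    spatial: Callable[[Frame], Feature] = mean_feature,
    temporal: Callable[[Sequence[Feature]], float] = brightness_trend,
) -> Iterator[float]:
    """Yield one temporal result per frame once the window is full."""
    # A bounded deque keeps memory fixed: the oldest feature is dropped
    # automatically when a new one arrives.
    buffer: deque = deque(maxlen=window_size)
    for frame in frames:
        buffer.append(spatial(frame))  # each frame is processed only once
        if len(buffer) == window_size:
            yield temporal(list(buffer))


# Four tiny 2-pixel frames that get steadily brighter.
frames = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
multi_frame = list(stream_analyze(frames, window_size=3))
single_frame = list(stream_analyze(frames, window_size=1))
```

With `window_size=1` the buffer holds a single frame, so the trend is always zero and the brightening is invisible. With `window_size=3` the same stream yields a non-zero trend. Storing compact per-frame features rather than raw frames is one way to keep memory bounded on a constrained device, though whether the paper does this is not stated in the abstract.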
