An Ultra-low Power TinyML System for Real-time Visual Processing at Edge
This work addresses the high power consumption of real-time visual processing at the edge, offering an incremental improvement in efficiency for embedded AI applications.
The paper tackles the challenge of running AI workloads on resource-constrained edge devices by proposing a TinyML system with a tiny CNN backbone, a neural co-processor (NCP), and a custom instruction set, consuming 160 mW at 30 FPS for object detection and recognition.
Tiny machine learning (TinyML), which executes AI workloads on systems with strictly limited resources and power, is an important and challenging topic. This brief first presents an extremely tiny backbone for constructing high-efficiency CNN models for various visual tasks. Then, a specially designed neural co-processor (NCP) is interconnected with an MCU to build an ultra-low-power TinyML system that stores all features and weights on chip, completely eliminating both the latency and the power consumption of off-chip memory access. Furthermore, an application-specific instruction set is presented to enable agile development and rapid deployment. Extensive experiments demonstrate that the proposed TinyML system, based on our model, NCP, and instruction set, yields considerable accuracy and achieves a record ultra-low power of 160 mW while performing object detection and recognition at 30 FPS. The demo video is available at \url{https://www.youtube.com/watch?v=mIZPxtJ-9EY}.
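To put the headline figures in per-frame terms, the reported power and frame rate can be combined into a rough energy budget. This is a back-of-the-envelope sketch, not a figure from the paper: it assumes that 160 mW is the average system power while running at a sustained 30 FPS, and the symbols $E_{\text{frame}}$, $P$, and $f$ are introduced here only for illustration.

\[
E_{\text{frame}} = \frac{P}{f} = \frac{160\ \text{mW}}{30\ \text{frames/s}} \approx 5.3\ \text{mJ per frame}
\]

Under that assumption, each detection-and-recognition pass costs roughly 5.3 mJ, with a time budget of about 33 ms per frame.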