LG | Apr 10, 2023

Deploying Machine Learning Models to Ahead-of-Time Runtime on Edge Using MicroTVM

arXiv:2304.04842v2 | 14 citations | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses the difficulty data scientists and developers face in executing AI models seamlessly on resource-constrained edge devices. It represents an incremental improvement in deployment tooling.

The paper tackles the deployment of machine learning models to edge devices by developing an end-to-end code generator that uses MicroTVM to parse pre-trained models into C source libraries. The authors demonstrate the generated ahead-of-time C runtime with a hand gesture recognition experiment on an ARM Cortex M4F core, and their analysis shows that specific compute-intensive operators can be offloaded to a dedicated accelerator.

In the past few years, more and more AI applications have been deployed on edge devices. However, models trained by data scientists with machine learning frameworks, such as PyTorch or TensorFlow, cannot be seamlessly executed on the edge. In this paper, we develop an end-to-end code generator that parses a pre-trained model into C source libraries for the backend using MicroTVM, a machine learning compiler framework extension addressing inference on bare-metal devices. An analysis shows that specific compute-intensive operators can be easily offloaded to a dedicated accelerator through the Universal Modular Accelerator (UMA) interface, while the others are processed on the CPU cores. Using the automatically generated ahead-of-time C runtime, we conduct a hand gesture recognition experiment on an ARM Cortex M4F core.
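
The split between accelerator and CPU described in the abstract can be illustrated with a toy sketch. This is not the paper's code generator and it does not use the TVM or UMA APIs. The `Op` class, the `partition` function, the `ACCELERATED_OPS` set, and the five-layer model are all hypothetical names and values chosen for illustration, assuming an accelerator that supports only convolution and dense layers.

```python
from dataclasses import dataclass

# Assumption: the accelerator supports only these operator kinds.
ACCELERATED_OPS = {"conv2d", "dense"}


@dataclass(frozen=True)
class Op:
    name: str  # unique layer name in the toy model
    kind: str  # operator kind, e.g. "conv2d"


def partition(graph, accelerated=ACCELERATED_OPS):
    """Map each operator name to "accel" if its kind is supported, else "cpu"."""
    return {
        op.name: ("accel" if op.kind in accelerated else "cpu")
        for op in graph
    }


# Hypothetical model: a small CNN written as a flat operator list.
model = [
    Op("conv1", "conv2d"),
    Op("relu1", "relu"),
    Op("pool1", "max_pool2d"),
    Op("fc1", "dense"),
    Op("out", "softmax"),
]

placement = partition(model)
for name, device in placement.items():
    print(f"{name}: {device}")
```

A real compiler flow matches operator patterns in the model graph rather than single operator kinds, but the outcome is the same in spirit: supported compute-intensive operators go to the accelerator backend and everything else falls back to the CPU code path.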

