PL · LG · Mar 8, 2021

Compiler Toolchains for Deep Learning Workloads on Embedded Platforms

arXiv:2104.04576v1 · Has Code
AI Analysis

The paper addresses the need for efficient deep learning deployment on mobile and embedded systems, but its contribution is incremental because it builds on existing toolchains.

The paper surveys and benchmarks existing open-source deep learning compiler toolchains for embedded platforms, and implements a compilation flow for heterogeneous devices to guide hardware developers.

As deep learning becomes increasingly popular in mobile and embedded solutions, framework-specific network representations must be converted into executable code for these embedded platforms. This paper consists of two parts. The first part is a survey and benchmark of the available open-source deep learning compiler toolchains, focusing on the capabilities and performance of the individual solutions when targeting embedded devices and microcontrollers that are combined with a dedicated accelerator in a heterogeneous fashion. The second part explores the implementation and evaluation of a compilation flow for such a heterogeneous device, reusing one of the existing toolchains to demonstrate the necessary steps for hardware developers who plan to build a software flow for their own hardware.

