LG PF Jan 14, 2020

A C Code Generator for Fast Inference and Simple Deployment of Convolutional Neural Networks on Resource Constrained Systems

arXiv:2001.05572v1, 5 citations
Originality: Incremental advance
AI Analysis

The code generator enables fast and simple deployment of CNNs in robotics and embedded devices where energy, space, and cost constraints prevent GPU use. The contribution is incremental, since it builds on existing code generation and optimization techniques.

The paper tackles the problem of running convolutional neural networks on resource-constrained systems without GPUs or deep learning frameworks. It introduces a neural network code generator that produces optimized ANSI C code, achieving speed-ups of up to a factor of 11.81 over TensorFlow XLA and Glow and lower latency than GPUs.

Inference of Convolutional Neural Networks in time-critical applications usually requires a GPU. In robotics or embedded devices these are often not available due to energy, space and cost constraints. Furthermore, installation of a deep learning framework or even a native compiler on the target platform is not possible. This paper presents a neural network code generator (NNCG) that generates, from a trained CNN, a plain ANSI C code file that encapsulates the inference in a single function. It can easily be included in existing projects and, due to the lack of dependencies, cross compilation is usually possible. Additionally, the code generation is optimized based on the known trained CNN and target platform following four design principles. The system is evaluated using small CNNs designed for this application. Compared to TensorFlow XLA and Glow, speed-ups of up to 11.81 can be shown, and even GPUs are outperformed regarding latency.
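To make the idea of "a plain ANSI C code file that encapsulates the inference in a single function" concrete, here is a minimal hand-written sketch of what such a file could look like. It is not actual NNCG output: the function name `cnn_inference`, the tiny network (one 3x3 convolution with ReLU followed by a one-output dense layer), and all weights are assumptions chosen for illustration. It only shows the general pattern of fixed shapes, weights emitted as constants, and no headers or libraries.

```c
/* Hypothetical generated file: every shape, weight, and name below is an
   illustrative assumption, not actual NNCG output. */

#define IN_H 4
#define IN_W 4
#define K 3
#define OUT_H (IN_H - K + 1)
#define OUT_W (IN_W - K + 1)

/* Trained parameters are known at generation time, so they can be
   emitted as constants instead of being loaded from a model file. */
static const float conv_w[K][K] = {
    {0.0f,  1.0f, 0.0f},
    {1.0f, -4.0f, 1.0f},
    {0.0f,  1.0f, 0.0f}
};
static const float conv_b = 0.5f;
static const float dense_w[OUT_H * OUT_W] = {1.0f, -1.0f, 0.5f, 2.0f};
static const float dense_b = 0.0f;

/* Whole inference in one function: 3x3 valid convolution, ReLU, then a
   dense layer with one output. in: IN_H*IN_W floats, row-major. */
void cnn_inference(const float *in, float *out)
{
    float feat[OUT_H * OUT_W];
    float acc;
    int y, x, ky, kx, i;

    for (y = 0; y < OUT_H; y++) {
        for (x = 0; x < OUT_W; x++) {
            acc = conv_b;
            for (ky = 0; ky < K; ky++) {
                for (kx = 0; kx < K; kx++) {
                    acc += conv_w[ky][kx] * in[(y + ky) * IN_W + (x + kx)];
                }
            }
            /* ReLU applied while the value is still in a register */
            feat[y * OUT_W + x] = acc > 0.0f ? acc : 0.0f;
        }
    }

    acc = dense_b;
    for (i = 0; i < OUT_H * OUT_W; i++) {
        acc += dense_w[i] * feat[i];
    }
    *out = acc;
}
```

A host project would compile this file with its existing toolchain and call `cnn_inference(image, &score)` on a buffer of `IN_H * IN_W` floats. Because all loop bounds are compile-time constants and there are no includes, a cross compiler is free to unroll or vectorize the loops for the target platform.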

Code Implementations: 1 repo
