DC · AR · CV · Nov 20, 2017

Tactics to Directly Map CNN graphs on Embedded FPGAs

arXiv:1712.04322v1 · 65 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of deploying CNNs efficiently on resource-constrained embedded systems, though it appears incremental because it builds on existing FPGA accelerator work.

The paper tackles the problem of directly mapping CNN graphs onto embedded FPGAs for acceleration, demonstrating feasibility and introducing the HADDOC2 tool to automate this process.

Deep Convolutional Neural Networks (CNNs) are the state of the art in image classification. Since CNN feed-forward propagation involves highly regular parallel computation, it benefits from a significant speed-up when running on fine-grained parallel programmable logic devices. As a consequence, several studies have proposed FPGA-based accelerators for CNNs. However, because of the large computational power required by CNNs, none of the previous studies has proposed a direct mapping of the CNN onto the physical resources of an FPGA, allocating each processing actor to its own hardware instance. In this paper, we demonstrate the feasibility of the so-called direct hardware mapping (DHM) and discuss several tactics we explore to make DHM usable in practice. As a proof of concept, we introduce the HADDOC2 open-source tool, which automatically transforms a CNN description into a synthesizable hardware description with platform-independent direct hardware mapping.
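
To get a feel for why direct hardware mapping is demanding, the following minimal sketch counts the multipliers a convolutional layer would need if every multiplication were given its own hardware instance. It assumes a fully spatial mapping in which a layer with C_in input channels, C_out output channels, and a K x K kernel needs C_in * C_out * K * K multipliers. The layer shapes are illustrative (LeNet-like) and are not taken from the paper, and the names `ConvLayer`, `dhm_multipliers`, and `total_multipliers` are hypothetical, not part of HADDOC2.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConvLayer:
    """Shape of one convolutional layer (hypothetical example values)."""
    in_channels: int
    out_channels: int
    kernel_size: int  # square kernel, K x K


def dhm_multipliers(layer: ConvLayer) -> int:
    """Multipliers needed if each multiplication gets its own instance.

    Assumption: one multiplier per (input channel, output channel,
    kernel tap) triple, with no time-multiplexing of hardware.
    """
    return layer.in_channels * layer.out_channels * layer.kernel_size ** 2


def total_multipliers(layers: Iterable[ConvLayer]) -> int:
    """Sum the per-layer multiplier counts over the whole network."""
    return sum(dhm_multipliers(layer) for layer in layers)


# Illustrative LeNet-like shapes (an assumption, not from the paper).
layers = [
    ConvLayer(in_channels=1, out_channels=20, kernel_size=5),
    ConvLayer(in_channels=20, out_channels=50, kernel_size=5),
]

for index, layer in enumerate(layers, start=1):
    print(f"conv{index}: {dhm_multipliers(layer)} multipliers")
print(f"total: {total_multipliers(layers)} multipliers")
```

Even for this small network the count reaches tens of thousands of multipliers, which gives a rough sense of why tactics to reduce resource usage matter before such a mapping fits on an embedded device.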

Code Implementations: 1 repo
