CV · Sep 4, 2018

Towards Efficient Convolutional Neural Network for Domain-Specific Applications on FPGA

arXiv:1809.03318v1 · 36 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of inefficient FPGA deployment for domain-specific CNN applications, offering incremental improvements through optimization techniques.

The paper tackles the inefficiency of deploying pre-trained CNN models on FPGAs for domain-specific tasks. It introduces TuRF, an end-to-end acceleration framework that combines transfer learning, efficient convolution blocks, and layer fusion. The generated designs outperform prior methods on the original VGG-16 and ResNet-50 models, and the optimised VGG-16 model is both more accurate and easier to process.

FPGAs have become a popular technology for implementing Convolutional Neural Networks (CNNs) in recent years. Most CNN applications on FPGAs are domain-specific, e.g., detecting objects from specific categories, and commonly used CNN models pre-trained on general datasets may not be efficient enough for them. This paper presents TuRF, an end-to-end CNN acceleration framework for efficiently deploying domain-specific applications on FPGAs. TuRF uses transfer learning to adapt pre-trained models to specific domains, replaces standard convolution layers with efficient convolution blocks, and applies layer fusion to enhance hardware design performance. We evaluate TuRF by deploying a pre-trained VGG-16 model for a domain-specific image recognition task onto a Stratix V FPGA. Results show that designs generated by TuRF achieve better performance than prior methods for the original VGG-16 and ResNet-50 models, while for the optimised VGG-16 model TuRF designs are more accurate and easier to process.
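To give a rough sense of why replacing standard convolution layers can make a model easier to process, the sketch below compares multiply-accumulate (MAC) counts for one layer. It assumes the efficient block is a depthwise separable convolution (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution); the abstract does not name the block type, and the function names and layer shape here are hypothetical, chosen only for illustration.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k standard convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def separable_conv_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape (assumption): 56 x 56 output, 256 -> 256 channels, 3 x 3 kernel.
h, w, c_in, c_out, k = 56, 56, 256, 256, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
sep = separable_conv_macs(h, w, c_in, c_out, k)
reduction = std / sep  # equals 1 / (1 / c_out + 1 / k**2)

print(f"standard:  {std:,} MACs")
print(f"separable: {sep:,} MACs")
print(f"reduction: {reduction:.2f}x")
```

Under these assumptions the saving depends mainly on the kernel size and the number of output channels, which is why such a substitution is usually paired with retraining (here, transfer learning) to recover accuracy.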

