CL, Oct 8, 2021

DPUV3INT8: A Compiler View to programmable FPGA Inference Engines

arXiv:2110.04327v1
Originality: Incremental advance
AI Analysis

This work provides a general solution for efficient FPGA-based inference in data centers, though it appears incremental as it builds on existing hand-tuned techniques.

The paper tackles the problem of deploying FPGA inference engines in data centers by developing a compiler for the DPUV3INT8 design, which achieves about 1.5 times better performance for Resnet50_v1 and over 80% hardware efficiency across a model zoo.

We have an FPGA design that we have made fast and efficient and have tested on a few important examples. Now we must infer a general solution to deploy in the data center. Here, we describe the FPGA DPUV3INT8 design and our compiler effort. The hand-tuned SW-HW solution for Resnet50_v1 achieves close to 2 times the throughput (images per second) of our best FPGA implementation. The compiler generalizes the hand-written techniques, achieving about 1.5 times better performance for the same example; it also extends the optimizations to a model zoo of networks, where it achieves 80+% HW efficiency.
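The two metrics quoted in the abstract, throughput speedup and HW efficiency, can be made concrete with a minimal sketch. It assumes that HW efficiency means achieved operations per second divided by the device's peak operations per second; the abstract does not define the term, so this is only one plausible reading. The function names and all numbers below are hypothetical and chosen for illustration, not taken from the paper.

```python
def speedup(new_images_per_s: float, baseline_images_per_s: float) -> float:
    """Throughput ratio of a new implementation over a baseline."""
    if baseline_images_per_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return new_images_per_s / baseline_images_per_s


def hw_efficiency(images_per_s: float, ops_per_image: float,
                  peak_ops_per_s: float) -> float:
    """Assumed definition: achieved ops/s divided by peak ops/s."""
    if peak_ops_per_s <= 0:
        raise ValueError("peak ops/s must be positive")
    return (images_per_s * ops_per_image) / peak_ops_per_s


# Hypothetical numbers, for illustration only (not measurements from the paper).
baseline_ips = 1000.0      # images/s of an assumed baseline implementation
compiled_ips = 1500.0      # images/s of an assumed compiler-generated schedule
ops_per_image = 8e9        # assumed operation count per image
peak_ops = 15e12           # assumed peak ops/s of the device

print(speedup(compiled_ips, baseline_ips))
print(hw_efficiency(compiled_ips, ops_per_image, peak_ops))
```

Under this reading, a speedup describes progress relative to another implementation, while efficiency describes how close a schedule comes to the hardware's ceiling, so the two figures in the abstract are complementary rather than redundant.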
