An FPGA-based Solution for Convolution Operation Acceleration
The work provides a hardware acceleration solution for edge-AI applications, though the contribution is incremental: it applies existing FPGA methods to a known bottleneck.
This paper tackles the problem of accelerating convolution operations in Convolutional Neural Networks by proposing an FPGA-based architecture, achieving 0.224 GOPS per core and up to 4.48 GOPS when fully utilized on an edge computing board.
Hardware-based acceleration is widely used to speed up computationally intensive mathematical operations. This paper proposes an FPGA-based architecture to accelerate the convolution operation, a complex and expensive computing step that appears in many Convolutional Neural Network models. We target the design at the standard convolution operation, intending to launch the product as an edge-AI solution. The project's purpose is to produce an FPGA IP core that can process one convolutional layer at a time. Because Verilog HDL is the primary design language for the architecture, system developers can deploy the IP core on various FPGA families. The experimental results show that a single computing core synthesized on a simple edge computing FPGA board delivers 0.224 GOPS. When the board is fully utilized, 4.48 GOPS can be achieved.
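For readers unfamiliar with the workload, the following is a minimal software sketch of a standard convolution over one layer, written as a plain reference model and not as the paper's hardware design. The function names, the stride-1 no-padding setup, and the choice to count one multiply-accumulate per inner step are all assumptions made for illustration; the abstract does not say how operations are counted. The second helper simply takes the ratio of the two reported throughput figures, which would correspond to the number of cores on the board only if throughput scales linearly with core count, something the abstract does not state.

```python
def conv2d_layer(inputs, kernels):
    """Direct (standard) convolution for one layer, stride 1, no padding.

    inputs:  nested lists shaped [C_in][H][W]
    kernels: nested lists shaped [C_out][C_in][K][K]
    Returns (outputs, mac_count); outputs is [C_out][H-K+1][W-K+1].
    """
    c_in, h, w = len(inputs), len(inputs[0]), len(inputs[0][0])
    k = len(kernels[0][0])
    out_h, out_w = h - k + 1, w - k + 1
    outputs, macs = [], 0
    for kernel in kernels:                      # one output feature map per kernel
        fmap = [[0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                acc = 0
                for c in range(c_in):           # accumulate over input channels
                    for i in range(k):
                        for j in range(k):
                            acc += inputs[c][y + i][x + j] * kernel[c][i][j]
                            macs += 1           # assumed unit of work: one MAC
                fmap[y][x] = acc
        outputs.append(fmap)
    return outputs, macs


def implied_core_count(total_gops, per_core_gops):
    """Ratio of board throughput to per-core throughput (assumes linear scaling)."""
    return round(total_gops / per_core_gops)
```

The six nested loops show why the operation is expensive: the work grows with the product of output size, channel counts, and kernel area, which is the kind of regular, repetitive arithmetic that parallel hardware cores are suited to.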