A General Neural Network Hardware Architecture on FPGA
This work addresses the problem of hardware efficiency for AI applications like autonomous driving and speech recognition, but it appears incremental as it builds on existing FPGA use in these fields.
The authors tackled the need for energy-efficient and parallel hardware for neural networks by implementing a general neural network architecture on an FPGA SOC platform, achieving high performance in forward and backward algorithms for DNNs with adaptability to different network types and scales.
Field Programmable Gate Arrays (FPGAs) play an increasingly important role in the data sampling and processing industries owing to their highly parallel architecture, low power consumption, and flexibility in implementing custom algorithms. In the artificial intelligence field in particular, training and implementing neural networks and machine learning algorithms demands energy-efficient hardware implementations and massively parallel computing capacity. Many global companies have therefore applied FPGAs to AI and machine learning tasks such as autonomous driving and Automatic Spoken Language Recognition (Baidu) [1] [2] and Bing search (Microsoft) [3]. Considering the great potential of FPGAs in these fields, we aim to implement a general neural network hardware architecture on the Xilinx ZU9CG System On Chip (SOC) platform [4], which offers abundant hardware resources and powerful processing capacity. The general neural network architecture on the FPGA SOC platform can perform the forward and backward algorithms of deep neural networks (DNNs) with high performance and can easily be adjusted according to the type and scale of the neural network.
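
To make the terms "forward and backward algorithms" and "type and scale" concrete, the following is a minimal software sketch in plain Python, not the hardware design described here. It assumes a fully connected network with sigmoid activations and a squared-error loss; the function names (`init_network`, `forward`, `backward`, `sgd_step`) and the `layer_sizes` parameter are hypothetical and chosen only for illustration. Changing `layer_sizes` changes the scale of the network without touching the rest of the code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(layer_sizes, seed=0):
    """Create weights and biases; layer_sizes such as [3, 4, 2] sets the scale."""
    rng = random.Random(seed)
    weights, biases = [], []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights.append([[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                        for _ in range(n_out)])
        biases.append([0.0] * n_out)
    return weights, biases


def forward(weights, biases, x):
    """Forward pass: return the activations of every layer, input included."""
    activations = [list(x)]
    for W, b in zip(weights, biases):
        prev = activations[-1]
        activations.append([sigmoid(sum(w * a for w, a in zip(row, prev)) + bj)
                            for row, bj in zip(W, b)])
    return activations


def backward(weights, activations, target):
    """Backward pass for the loss 0.5 * sum((y - t)^2); return gradients."""
    out = activations[-1]
    # Error term at the output layer: dL/dz = (y - t) * y * (1 - y)
    delta = [(y - t) * y * (1.0 - y) for y, t in zip(out, target)]
    grad_w, grad_b = [], []
    for layer in range(len(weights) - 1, -1, -1):
        prev = activations[layer]
        grad_w.insert(0, [[d * a for a in prev] for d in delta])
        grad_b.insert(0, list(delta))
        if layer > 0:
            # Propagate the error term through this layer's weights
            W = weights[layer]
            delta = [sum(W[j][i] * delta[j] for j in range(len(W)))
                     * prev[i] * (1.0 - prev[i])
                     for i in range(len(prev))]
    return grad_w, grad_b


def loss(weights, biases, x, target):
    out = forward(weights, biases, x)[-1]
    return 0.5 * sum((y - t) ** 2 for y, t in zip(out, target))


def sgd_step(weights, biases, x, target, lr=0.5):
    """One gradient-descent update on a single sample (updates in place)."""
    acts = forward(weights, biases, x)
    grad_w, grad_b = backward(weights, acts, target)
    for W, b, gW, gb in zip(weights, biases, grad_w, grad_b):
        for j in range(len(W)):
            for i in range(len(W[j])):
                W[j][i] -= lr * gW[j][i]
            b[j] -= lr * gb[j]
```

For example, `init_network([3, 4, 2])` builds a network with three inputs, one hidden layer of four units, and two outputs, and repeated calls to `sgd_step` lower the loss on a training sample. The multiply-accumulate sums inside `forward` and `backward` are independent across output units, which is the kind of work a parallel hardware platform is well suited to.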