Fixed-Point Convolutional Neural Network for Real-Time Video Processing in FPGA
This paper addresses the problem of real-time video processing on embedded systems, though the contribution appears incremental because it builds on existing mobile neural network approaches.
The paper tackles the challenge of implementing neural networks on FPGAs for real-time video processing by proposing a fixed-point architecture with fewer weights and hardware-level optimizations, which the authors report copes well with real-time video on low-cost FPGAs.
Modern mobile neural networks with a reduced number of weights and parameters perform well on image classification tasks, but even they may be too complex to implement in an FPGA for video processing. This article proposes a neural network architecture for the practical task of recognizing images from a camera, with several advantages in speed. These are achieved by reducing the number of weights, moving from floating-point to fixed-point arithmetic, and applying a number of hardware-level optimizations: storing weights in blocks, using a shift register, and running an adjustable number of convolutional blocks in parallel. The article also proposes methods for adapting an existing data set to a different task. Experiments show that the proposed neural network copes well with real-time video processing even on inexpensive FPGAs.
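The move from floating-point to fixed-point arithmetic can be illustrated with a minimal sketch. The word length (16 bits), the number of fractional bits (8), and the function names `to_fixed`, `to_float`, and `conv1d_fixed` are assumptions chosen for illustration; they are not taken from the article, and the sketch does not model the weight storage, shift register, or parallel convolutional blocks.

```python
def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantize a float to a signed fixed-point integer, saturating on overflow.

    Assumed format: total_bits wide, frac_bits of which are fractional.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * scale)))


def to_float(q, frac_bits=8):
    """Convert a fixed-point integer back to a float for inspection."""
    return q / (1 << frac_bits)


def conv1d_fixed(signal_q, kernel_q, frac_bits=8):
    """Valid 1-D sliding dot product using only integer multiply-accumulate."""
    n_out = len(signal_q) - len(kernel_q) + 1
    out = []
    for i in range(n_out):
        acc = 0  # wide accumulator, as a hardware MAC unit would use
        for j, w in enumerate(kernel_q):
            # Each product carries 2 * frac_bits fractional bits.
            acc += signal_q[i + j] * w
        # Arithmetic right shift rescales the sum back to frac_bits.
        out.append(acc >> frac_bits)
    return out


signal = [0.5, -0.25, 1.0, 0.75]
kernel = [0.5, 0.25]

signal_q = [to_fixed(v) for v in signal]
kernel_q = [to_fixed(v) for v in kernel]
result = [to_float(q) for q in conv1d_fixed(signal_q, kernel_q)]
print(result)
```

The inputs here are exactly representable with 8 fractional bits, so the integer result matches the floating-point one; for arbitrary weights the quantization step introduces a rounding error bounded by half of the least significant bit per value.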