Accelerating ODE-Based Neural Networks on Low-Cost FPGAs
This work addresses the acceleration of ODE-based neural networks on resource-limited edge devices, specifically low-cost FPGAs, a problem relevant to embedded AI applications.
This paper proposes reduced ODENet (rODENet) variants for efficient implementation on low-cost FPGAs. Implementing part of an rODENet as dedicated logic on a PYNQ-Z2 board improves the overall execution time by up to 2.66 times compared to pure software execution, while maintaining accuracy comparable to the original ODENet.
ODENet is a deep neural network architecture in which the stacked structure of ResNet is implemented with an ordinary differential equation (ODE) solver. It can reduce the number of parameters and strike a balance between accuracy and performance by selecting a proper solver. It is also possible to improve the accuracy while keeping the same number of parameters on resource-limited edge devices. In this paper, using the Euler method as the ODE solver, a part of ODENet is implemented as dedicated logic on a low-cost FPGA (Field-Programmable Gate Array) board, such as the PYNQ-Z2 board. As ODENet variants, reduced ODENets (rODENets) are proposed and analyzed for low-cost FPGA implementation; each variant heavily uses a part of the ODENet layers and reduces or eliminates some of the other layers in a different way. They are evaluated in terms of parameter size, accuracy, execution time, and resource utilization on the FPGA. The results show that the overall execution time of an rODENet variant is improved by up to 2.66 times compared to pure software execution while keeping accuracy comparable to the original ODENet.
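
To illustrate how an ODE solver can stand in for a stack of ResNet blocks, the following is a minimal sketch in plain Python, not the paper's implementation. It applies the explicit Euler method to dz/dt = f(z, t), reusing the same function f (and hence the same parameters) at every step. The names `euler_ode_block` and `toy_layer` are hypothetical, and `toy_layer` is only a stand-in for the convolutional layers that a real ODENet block would evaluate.

```python
from typing import Callable, List

Vector = List[float]


def euler_ode_block(
    f: Callable[[Vector, float], Vector],
    z0: Vector,
    t0: float = 0.0,
    t1: float = 1.0,
    steps: int = 10,
) -> Vector:
    """Integrate dz/dt = f(z, t) from t0 to t1 with the explicit Euler method.

    The same f is evaluated at every step, so the parameter count does not
    grow with the number of steps, unlike a stack of distinct ResNet blocks.
    """
    h = (t1 - t0) / steps  # step size
    z = list(z0)
    t = t0
    for _ in range(steps):
        dz = f(z, t)
        # Euler update: z <- z + h * f(z, t)
        z = [zi + h * di for zi, di in zip(z, dz)]
        t += h
    return z


def toy_layer(z: Vector, t: float) -> Vector:
    """Hypothetical stand-in for the layers inside an ODE block: dz/dt = -z."""
    return [-zi for zi in z]


if __name__ == "__main__":
    # With a single step of size 1, the update z + f(z) has the same form
    # as a ResNet residual connection.
    print(euler_ode_block(toy_layer, [2.0], steps=1))
    # More steps reuse the same f with a smaller step size.
    print(euler_ode_block(toy_layer, [1.0], steps=10))
```

In this sketch the number of steps trades accuracy against execution time without changing the parameter count, which is the kind of balance the abstract attributes to the choice of solver. Under this reading, the repeated evaluation of f is the part that would be moved into dedicated logic.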