High-Performance FPGA-based Accelerator for Bayesian Neural Networks
This work addresses the problem of deploying Bayesian neural networks (BNNs) in safety-critical applications such as healthcare and autonomous vehicles by improving hardware efficiency, though the contribution is incremental in that it builds on existing Monte Carlo Dropout methods.
The paper tackles the high computational cost and limited hardware performance of Bayesian neural networks (BNNs) by proposing a novel FPGA-based accelerator, achieving up to 4 times higher energy efficiency and 9 times better compute efficiency compared to state-of-the-art BNN accelerators.
Neural networks (NNs) have demonstrated their potential in a wide range of applications, such as image recognition, decision making, and recommendation systems. However, standard NNs cannot capture model uncertainty, which is crucial for many safety-critical applications, including healthcare and autonomous vehicles. In contrast, Bayesian neural networks (BNNs) can express uncertainty in their predictions on a principled mathematical basis. Nevertheless, BNNs have not been widely adopted in industrial practice, mainly because of their high computational cost and limited hardware performance. This work proposes a novel FPGA-based hardware architecture to accelerate BNNs inferred through Monte Carlo Dropout. Compared with other state-of-the-art BNN accelerators, the proposed accelerator achieves up to 4 times higher energy efficiency and 9 times better compute efficiency. Taking partial Bayesian inference into account, an automatic framework is proposed that explores the trade-off between hardware and algorithmic performance. Extensive experiments demonstrate that the proposed framework can effectively find the optimal points in the design space.
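To make the inference scheme concrete, the following is a minimal software sketch of Monte Carlo Dropout, not the accelerator design itself. Dropout stays active at inference time, several stochastic forward passes are averaged, and the entropy of the averaged prediction serves as an uncertainty estimate. The sketch assumes that "partial Bayesian inference" means applying dropout only to the last few layers, so that the deterministic front of the network is evaluated once and reused across samples. That reading, the placement of dropout on each Bayesian layer's input, and all names (`mc_dropout_predict`, `num_bayes_layers`, `num_samples`) are illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, p, rng):
    """Inverted dropout; for MC Dropout it remains active at inference."""
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def mc_dropout_predict(x, layers, num_bayes_layers, num_samples, p=0.5, seed=0):
    """Return (mean class probabilities, predictive entropy).

    `layers` is a list of (weights, bias) pairs. Only the last
    `num_bayes_layers` layers use dropout (assumed meaning of
    "partial Bayesian inference").
    """
    if not 1 <= num_bayes_layers <= len(layers):
        raise ValueError("num_bayes_layers must be in [1, len(layers)]")
    rng = random.Random(seed)
    split = len(layers) - num_bayes_layers

    # Deterministic prefix: computed once and shared by every sample.
    h = x
    for w, b in layers[:split]:
        h = relu(dense(h, w, b))

    # Bayesian suffix: one stochastic pass per Monte Carlo sample.
    total = [0.0] * len(layers[-1][1])
    for _ in range(num_samples):
        s = h
        for i, (w, b) in enumerate(layers[split:]):
            s = dense(dropout(s, p, rng), w, b)
            if split + i < len(layers) - 1:  # no ReLU on the output layer
                s = relu(s)
        total = [a + q for a, q in zip(total, softmax(s))]

    mean = [v / num_samples for v in total]
    entropy = -sum(q * math.log(q) for q in mean if q > 0.0)
    return mean, entropy
```

In this sketch, a smaller `num_bayes_layers` shrinks the portion of the network that must be re-evaluated for every sample, while a larger `num_samples` averages over more stochastic passes. These two knobs are one plausible instance of the hardware versus algorithmic trade-off that the abstract says the automatic framework explores.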