On the Effects of Quantisation on Model Uncertainty in Bayesian Neural Networks
This work addresses the practical deployment of BNNs in safety-critical applications by reducing their memory and compute demands, though the contribution is incremental, since it applies existing quantisation methods to BNNs.
The paper investigates the impact of quantising Bayesian neural networks (BNNs) to reduce computational costs, finding that uniform quantisation does not substantially degrade the quality of uncertainty estimation.
Bayesian neural networks (BNNs) are making significant progress in many research areas where decision-making needs to be accompanied by uncertainty estimation. Being able to quantify uncertainty while making decisions is essential for understanding when the model is over-/under-confident, and hence BNNs are attracting interest in safety-critical applications, such as autonomous driving, healthcare, and robotics. Nevertheless, BNNs have not been widely adopted in industrial practice, mainly because of their increased memory and compute costs. In this work, we investigate quantisation of BNNs by compressing 32-bit floating-point weights and activations to their integer counterparts, an approach that has already been successful in reducing the compute demand of standard pointwise neural networks. We study three types of quantised BNNs and evaluate them under a wide range of settings. We empirically demonstrate that a uniform quantisation scheme applied to BNNs does not substantially decrease their quality of uncertainty estimation.
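To make the idea of uniform quantisation concrete, the following is a minimal illustrative sketch, not the scheme used in the paper. It assumes symmetric per-tensor quantisation to 8 bits, a toy Gaussian posterior over the weights of a single linear layer, and Monte Carlo sampling to estimate predictive spread. The function names (`quantise_uniform`, `dequantise`, `mc_outputs`) and all numerical settings are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantise_uniform(x, num_bits=8):
    """Symmetric uniform quantisation of a float array to signed integer levels.

    Assumption: one scale per tensor, levels in [-(2^(b-1)-1), 2^(b-1)-1].
    Integers are stored as int32 here for simplicity.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale


def dequantise(q, scale):
    """Map integer levels back to approximate float32 values."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)

# Hypothetical posterior: independent Gaussians over a 4x3 weight matrix.
mean = rng.normal(0.0, 0.5, size=(4, 3)).astype(np.float32)
std = np.float32(0.1)
x = rng.normal(size=(1, 4)).astype(np.float32)  # a single toy input


def mc_outputs(quantised, n_samples=100):
    """Sample weights from the toy posterior and collect layer outputs."""
    outs = []
    for _ in range(n_samples):
        w = mean + std * rng.standard_normal(mean.shape).astype(np.float32)
        if quantised:
            # Quantise each sampled weight tensor, then dequantise for the matmul.
            w = dequantise(*quantise_uniform(w))
        outs.append(x @ w)
    return np.stack(outs)                       # shape: (n_samples, 1, 3)


# Compare the spread of outputs (a crude proxy for predictive uncertainty).
print("float32 std:", mc_outputs(False).std(axis=0))
print("int8    std:", mc_outputs(True).std(axis=0))
```

In this toy setting the per-weight rounding error is bounded by half the quantisation step, which gives some intuition for why the spread of sampled outputs may change little. It says nothing about the three quantised BNN variants or the evaluation settings studied in the paper.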