LG · Dec 11, 2025

Uncertainty-Preserving QBNNs: Multi-Level Quantization of SVI-Based Bayesian Neural Networks for Image Classification

Hendrik Borras, Yong Wu, Bernhard Klein, Holger Fröning

arXiv:2512.10602v1 · 4.1 · h-index: 4

Originality: Incremental advance

AI Analysis

This work enables deployment of BNNs on resource-constrained edge devices, though the advance is incremental: it applies quantization to an existing probabilistic method.

The authors tackled the computational and memory overhead of Bayesian Neural Networks (BNNs) by developing a multi-level quantization framework, achieving up to 8x memory reduction at 4-bit precision while maintaining classification accuracy and uncertainty estimation on Dirty-MNIST.
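The 8x figure is consistent with simple bit-width arithmetic if the floating-point baseline is 32-bit, which is an assumption here since the summary does not state the baseline precision. The sketch below is illustrative only; the function names and the choice of two stored parameters per weight (a mean and a variance parameter) are hypothetical and not taken from the paper.

```python
def reduction_factor(baseline_bits: int, quantized_bits: int) -> float:
    """Memory reduction from storing each parameter in fewer bits."""
    return baseline_bits / quantized_bits


def bnn_memory_bytes(num_weights: int, bits: int, params_per_weight: int = 2) -> float:
    """Storage for a mean-field BNN layer.

    Assumes each weight stores `params_per_weight` variational parameters
    (for example a mean and a variance parameter), each using `bits` bits.
    """
    return num_weights * params_per_weight * bits / 8


# Assumed float32 baseline versus 4-bit storage.
print(reduction_factor(32, 4))          # 8.0
print(bnn_memory_bytes(1_000_000, 32))  # 8000000.0 bytes
print(bnn_memory_bytes(1_000_000, 4))   # 1000000.0 bytes
```

Under a 16-bit baseline the same arithmetic would give 4x, so the reported factor depends on which floating-point format is used for comparison.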

Bayesian Neural Networks (BNNs) provide principled uncertainty quantification but suffer from substantial computational and memory overhead compared to deterministic networks. While quantization techniques have successfully reduced resource requirements in standard deep learning models, their application to probabilistic models remains largely unexplored. We introduce a systematic multi-level quantization framework for Stochastic Variational Inference (SVI)-based BNNs that distinguishes between three quantization strategies: Variational Parameter Quantization (VPQ), Sampled Parameter Quantization (SPQ), and Joint Quantization (JQ). Our logarithmic quantization of variance parameters and our specialized activation functions, which preserve the distributional structure, are essential for calibrated uncertainty estimation. Through comprehensive experiments on Dirty-MNIST, we demonstrate that BNNs can be quantized down to 4-bit precision while maintaining both classification accuracy and uncertainty disentanglement. At 4 bits, Joint Quantization achieves up to 8x memory reduction compared to floating-point implementations with minimal degradation in epistemic and aleatoric uncertainty estimation. These results enable deployment of BNNs on resource-constrained edge devices and provide design guidelines for future analog "Bayesian Machines" operating at inherently low precision.
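To make the three strategies concrete, here is a minimal NumPy sketch of one plausible reading of their names: VPQ quantizes the stored variational parameters (mean and standard deviation), SPQ quantizes the weights after sampling, and JQ does both. The abstract does not give the exact quantizers, so everything below is an assumption: the symmetric uniform quantizer, the power-of-two logarithmic quantizer for the standard deviation, and all function and parameter names (`quantize_uniform`, `quantize_log2`, `sample_weights`, `w_max`, `min_exp`) are hypothetical.

```python
import numpy as np


def quantize_uniform(x, bits, x_max):
    """Symmetric uniform quantizer: snap to a signed integer grid, then rescale."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 for 4 bits
    scale = x_max / levels
    q = np.clip(np.round(x / scale), -levels, levels)
    return q * scale


def quantize_log2(sigma, bits, min_exp):
    """Logarithmic quantizer for positive values.

    Rounds log2(sigma) to an integer exponent and clips it to the
    2**bits exponents starting at `min_exp`, so small standard
    deviations keep relative rather than absolute resolution.
    """
    exp = np.round(np.log2(sigma))
    exp = np.clip(exp, min_exp, min_exp + 2 ** bits - 1)
    return 2.0 ** exp


def sample_weights(mu, sigma, rng, bits=4, mode="JQ", w_max=1.0, min_exp=-15):
    """Draw one weight sample from a mean-field Gaussian posterior.

    mode="VPQ": quantize mu and sigma before sampling.
    mode="SPQ": quantize the sampled weights only.
    mode="JQ":  apply both (assumed meaning of Joint Quantization).
    """
    if mode in ("VPQ", "JQ"):
        mu = quantize_uniform(mu, bits, w_max)
        sigma = quantize_log2(sigma, bits, min_exp)
    w = mu + sigma * rng.standard_normal(mu.shape)  # reparameterized sample
    if mode in ("SPQ", "JQ"):
        w = quantize_uniform(w, bits, w_max)
    return w


rng = np.random.default_rng(0)
mu = rng.uniform(-0.5, 0.5, size=8)
sigma = np.full(8, 0.03)
w_jq = sample_weights(mu, sigma, rng, bits=4, mode="JQ")
```

The logarithmic grid is used for the standard deviation in this sketch because a uniform 4-bit grid over the weight range would round most small values of sigma to zero, which would collapse the posterior to a point estimate and remove the epistemic uncertainty signal.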
