Batch Normalization-Free Fully Integer Quantized Neural Networks via Progressive Tandem Learning
This work enables end-to-end integer-only inference for resource-constrained settings such as edge and embedded devices. Its contribution is incremental, since it builds on existing quantization workflows.
The paper addresses the dependency of quantized neural networks on batch normalization, which prevents true integer-only deployment. It introduces a progressive tandem learning method that achieves competitive Top-1 accuracy on ImageNet with AlexNet under aggressive quantization.
Quantised neural networks (QNNs) shrink models and reduce inference energy through low-bit arithmetic, yet most still depend on batch normalisation (BN) layers with running statistics, which prevents true integer-only deployment. Prior attempts remove BN by parameter folding or tailored initialisation; while helpful, they rarely recover BN's stability and accuracy and often impose bespoke constraints. We present a BN-free, fully integer QNN trained via a progressive, layer-wise distillation scheme that slots into existing low-bit pipelines. Starting from a pretrained BN-enabled teacher, we use layer-wise targets and progressive compensation to train a student that performs inference exclusively with integer arithmetic and contains no BN operations. On ImageNet with AlexNet, the BN-free model attains competitive Top-1 accuracy under aggressive quantisation. The procedure integrates directly with standard quantisation workflows, enabling end-to-end integer-only inference for resource-constrained settings such as edge and embedded devices.
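For readers unfamiliar with the parameter folding mentioned above as a prior approach, the following is a minimal sketch of the standard technique for a single dense layer, not the method proposed in this paper. The function names (`linear`, `batch_norm`, `fold_bn`) and all numeric values are illustrative assumptions. The sketch absorbs BN's inference-time per-channel scale and shift into the layer's weights and biases, then checks that the folded layer reproduces the original output.

```python
import math

def linear(x, weights, biases):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN using stored running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]

def fold_bn(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Absorb BN's per-channel affine transform into the layer parameters."""
    scales = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    folded_w = [[w * s for w in row] for row, s in zip(weights, scales)]
    folded_b = [(b - m) * s + bt
                for b, m, s, bt in zip(biases, mean, scales, beta)]
    return folded_w, folded_b

# Toy values, chosen only for illustration (hypothetical, not from the paper).
x = [0.5, -1.0, 2.0]
W = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.3]
mean, var = [0.3, -0.1], [0.9, 1.6]

# Reference path: layer followed by BN.
reference = batch_norm(linear(x, W, b), gamma, beta, mean, var)

# Folded path: a single layer with no BN operation.
W_f, b_f = fold_bn(W, b, gamma, beta, mean, var)
folded = linear(x, W_f, b_f)
```

The equivalence is exact only in floating point. Once the folded weights are quantised to a low bit-width, the match becomes approximate, which is consistent with the observation above that folding alone rarely recovers BN's accuracy.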