LG CVMay 15, 2025

A probabilistic framework for dynamic quantization

Gabriele Santini, Francesco Paissan, Elisabetta Farella

arXiv:2505.10689v17.11 citationsh-index: 8

Originality Incremental advance

AI Analysis

This work addresses efficient quantization for neural networks, offering a domain-specific improvement in computer vision applications.

The authors tackled the problem of dynamic quantization in neural networks by proposing a probabilistic framework that adapts quantization parameters per input with minimal computational overhead, achieving negligible performance loss on computer vision tasks.

We propose a probabilistic framework for dynamic quantization of neural networks that allows for a computationally efficient input-adaptive rescaling of the quantization parameters. Our framework applies a probabilistic model to the network's pre-activations through a lightweight surrogate, enabling the adaptive adjustment of the quantization parameters on a per-input basis without significant memory overhead. We validate our approach on a set of popular computer vision tasks and models, observing only a negligible loss in performance. Our method strikes the best performance and computational overhead tradeoff compared to standard quantization strategies.

View on arXiv PDF

Similar