LGCVMay 15, 2025

A probabilistic framework for dynamic quantization

arXiv:2505.10689v11 citationsh-index: 8
Originality Incremental advance
AI Analysis

This work addresses efficient quantization for neural networks, offering a domain-specific improvement in computer vision applications.

The authors tackled the problem of dynamic quantization in neural networks by proposing a probabilistic framework that adapts quantization parameters per input with minimal computational overhead, achieving negligible performance loss on computer vision tasks.

We propose a probabilistic framework for dynamic quantization of neural networks that allows for a computationally efficient input-adaptive rescaling of the quantization parameters. Our framework applies a probabilistic model to the network's pre-activations through a lightweight surrogate, enabling the adaptive adjustment of the quantization parameters on a per-input basis without significant memory overhead. We validate our approach on a set of popular computer vision tasks and models, observing only a negligible loss in performance. Our method strikes the best performance and computational overhead tradeoff compared to standard quantization strategies.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes