LG AI · Aug 27, 2025

Beacon: Post-Training Quantization with Integrated Grid Selection

arXiv:2508.20293v2 · 1 citation · h-index: 24 · IEEE Signal Processing Letters

Originality: Incremental advance

AI Analysis

Beacon offers a practical route to efficient model deployment by removing manual scale tuning from post-training quantization (PTQ). The advance is incremental, since it builds on existing PTQ methods.

The paper tackles the challenge of selecting scaling factors in per-channel post-training quantization. It introduces Beacon, a tuning-free algorithm that determines the optimal scaling factors automatically from the geometry of scalar quantization and achieves performance competitive with state-of-the-art methods.

Quantization is a widely used compression technique for reducing the memory and computation costs of large pre-trained models. A key challenge in per-channel post-training quantization (PTQ) is selecting appropriate scaling factors to replace weight values with values from a scaled integer grid. Existing methods typically fix the scale at the outset via heuristic tuning or grid search. We propose Beacon, a simple and effective algorithm that eliminates the need for such manual tuning. Beacon performs per-channel PTQ directly using an unscaled grid and automatically determines the optimal scaling factors by exploiting the geometry of scalar quantization. It does not rely on back-propagation or large calibration sets. Despite its simplicity and tuning-free nature, Beacon achieves competitive performance compared to state-of-the-art methods, making it a practical solution for efficient model deployment.
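For context, the conventional approach that the abstract contrasts with (fixing a per-channel scale up front by grid search, then rounding to the scaled integer grid) can be sketched as follows. This is a minimal illustration of that baseline and not Beacon itself, whose procedure the abstract does not spell out. The function names, the symmetric signed grid, the squared-error criterion, and the candidate schedule are all assumptions made for the example.

```python
from typing import List, Tuple


def quantize_channel(w: List[float], scale: float, bits: int) -> List[float]:
    """Round each weight to the nearest point of a scaled symmetric integer grid."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4 bits; grid is {-qmax, ..., qmax}
    out = []
    for x in w:
        q = round(x / scale)            # nearest integer level
        q = max(-qmax, min(qmax, q))    # clip to the representable range
        out.append(q * scale)           # map back to the weight domain
    return out


def search_scale(w: List[float], bits: int,
                 num_candidates: int = 100) -> Tuple[float, List[float]]:
    """Grid-search a scale for one channel by minimizing squared error.

    Assumption: candidates are fractions of the max-abs scale; real methods
    may use other schedules or an activation-aware error.
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(x) for x in w)
    if max_abs == 0.0:
        return 1.0, [0.0] * len(w)      # all-zero channel: any scale works
    base = max_abs / qmax               # scale that covers the full range
    best_scale, best_err, best_wq = base, float("inf"), list(w)
    for k in range(1, num_candidates + 1):
        scale = base * (k / num_candidates)
        wq = quantize_channel(w, scale, bits)
        err = sum((a - b) ** 2 for a, b in zip(w, wq))
        if err < best_err:
            best_scale, best_err, best_wq = scale, err, wq
    return best_scale, best_wq


def quantize_per_channel(W: List[List[float]], bits: int = 4,
                         num_candidates: int = 100):
    """Quantize each row (channel) of W with its own searched scale."""
    scales, rows = [], []
    for row in W:
        s, wq = search_scale(row, bits, num_candidates)
        scales.append(s)
        rows.append(wq)
    return scales, rows


if __name__ == "__main__":
    W = [[0.1, -0.5, 0.9, 0.3], [2.0, -4.0, 1.0, 0.0]]
    scales, Wq = quantize_per_channel(W, bits=4)
    print(scales)
    print(Wq)
```

The search loop is the tuning step that, according to the abstract, Beacon avoids by working on the unscaled grid and recovering the scale afterwards.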
