CV · Sep 30, 2025

Post-Training Quantization via Residual Truncation and Zero Suppression for Diffusion Models

arXiv:2509.26436v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses deployment challenges for diffusion models by enabling efficient 4-bit post-training quantization. The advance is incremental, but it improves on existing techniques and is relevant to practitioners deploying image-generation models.

The paper extends post-training quantization of diffusion models to 4 bits, a setting in which rounding errors often degrade texture fidelity. It proposes QuaRTZ, which achieves an FID of 6.98 on FLUX.1-schnell and outperforms prior methods such as SVDQuant.

Diffusion models achieve high-quality image generation but face deployment challenges due to their high computational requirements. Although 8-bit outlier-aware post-training quantization (PTQ) matches full-precision performance, extending PTQ to 4 bits remains challenging. Larger step sizes in 4-bit quantization amplify rounding errors in dense, low-magnitude activations, leading to the loss of fine-grained textures. We hypothesize that not only outliers but also small activations are critical for texture fidelity. To this end, we propose Quantization via Residual Truncation and Zero Suppression (QuaRTZ), a 4-bit PTQ scheme for diffusion models. QuaRTZ applies 8-bit min-max quantization for outlier handling and compresses to 4 bits via leading-zero suppression to retain least significant bits (LSBs), thereby preserving texture details. Our approach reduces rounding errors and improves quantization efficiency by balancing outlier preservation and LSB precision. Both theoretical derivations and empirical evaluations demonstrate the generalizability of QuaRTZ across diverse activation distributions. Notably, 4-bit QuaRTZ achieves an FID of 6.98 on FLUX.1-schnell, outperforming SVDQuant, which requires auxiliary FP16 branches.
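
The two-stage idea in the abstract (quantize to 8 bits first, then compress to 4 bits by dropping leading zeros so that low-order bits survive) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the symmetric absolute-maximum scaling, the sign-plus-3-magnitude-bit layout, the group size of 8, and the per-group shift are all assumptions made for illustration; the abstract does not specify any of these details.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric 8-bit quantization scaled by the largest magnitude (assumed variant)."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int16)
    return q, scale

def compress_4bit(q, group_size=8, mag_bits=3):
    """Per group, skip the leading zeros shared by all values and keep the
    next `mag_bits` magnitude bits; the sign is stored separately.
    Hypothetical layout: 1 sign bit + 3 magnitude bits per value."""
    groups = q.reshape(-1, group_size)
    sign = np.where(groups < 0, -1, 1).astype(np.int16)
    mag = np.abs(groups)
    # Bit length of the largest magnitude in each group.
    bitlen = np.array([int(m).bit_length() for m in mag.max(axis=1)])
    # Groups that already fit in `mag_bits` bits get shift 0 and lose nothing.
    shift = np.maximum(bitlen - mag_bits, 0).astype(np.int16)
    codes = mag >> shift[:, None]  # truncate the low-order residual bits
    return sign, codes, shift

def decompress(sign, codes, shift):
    """Rebuild 8-bit integer values from sign, 3-bit codes, and group shifts."""
    return sign * (codes << shift[:, None])
```

In this sketch, a group whose largest 8-bit magnitude is at most 7 is reconstructed exactly, so dense low-magnitude values keep all of their low-order bits. A group that contains an outlier is shifted right and loses only the bits below the shift. Multiplying the reconstructed integers by `scale` returns approximate floating-point activations.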

