Q-Drift: Quantization-Aware Drift Correction for Diffusion Model Sampling
This work addresses a practical deployment issue for large diffusion models: it enables efficient quantization with minimal quality loss. The contribution is incremental, since it builds on existing post-training quantization (PTQ) methods.
The paper tackles quantization noise that degrades generation quality in diffusion models. It proposes Q-Drift, a sampler-side correction that reduces FID by up to 4.59 on PixArt-Sigma (SVDQuant W3A4) relative to the quantized baseline while preserving CLIP scores.
Post-training quantization (PTQ) is a practical path to deploy large diffusion models, but quantization noise can accumulate over the denoising trajectory and degrade generation quality. We propose Q-Drift, a principled sampler-side correction that treats quantization error as an implicit stochastic perturbation on each denoising step and derives a marginal-distribution-preserving drift adjustment. Q-Drift estimates a timestep-wise variance statistic from calibration, in practice requiring as few as 5 paired full-precision/quantized calibration runs. The resulting sampler correction is plug-and-play with common samplers, diffusion models, and PTQ methods, while incurring negligible overhead at inference. Across six diverse text-to-image models (spanning DiT and U-Net), three samplers (Euler, flow-matching, DPM-Solver++), and two PTQ methods (SVDQuant, MixDQ), Q-Drift improves FID over the corresponding quantized baseline in most settings, with up to 4.59 FID reduction on PixArt-Sigma (SVDQuant W3A4), while preserving CLIP scores.
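The calibration step described above can be illustrated with a minimal sketch. The abstract does not define the variance statistic, so this example assumes it is the variance of the element-wise difference between quantized and full-precision model outputs, pooled over all paired runs at each denoising step. The function name `estimate_timestep_variance` and the nested-list data layout are hypothetical, chosen only for illustration; the drift adjustment itself is not reproduced here because the abstract does not give its form.

```python
from statistics import fmean


def estimate_timestep_variance(fp_runs, q_runs):
    """Estimate one quantization-error variance per denoising step.

    Assumed layout (hypothetical): fp_runs[r][t] and q_runs[r][t] are flat
    lists of floats holding the model output of calibration run r at step t,
    for the full-precision and the quantized model respectively. Paired runs
    are assumed to share the same prompt, seed, and inputs.
    """
    if not fp_runs or len(fp_runs) != len(q_runs):
        raise ValueError("need the same non-zero number of paired runs")

    num_steps = len(fp_runs[0])
    variances = []
    for t in range(num_steps):
        # Pool the element-wise quantization error over all runs at step t.
        errors = [
            q - f
            for fp_run, q_run in zip(fp_runs, q_runs)
            for f, q in zip(fp_run[t], q_run[t])
        ]
        mean_err = fmean(errors)
        # Population variance of the pooled error (an assumed statistic).
        variances.append(fmean([(e - mean_err) ** 2 for e in errors]))
    return variances


# Toy usage: two paired runs, two denoising steps, two output elements each.
fp = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 1.0]]]
q = [[[0.1, -0.1], [1.0, 1.0]], [[0.1, -0.1], [1.0, 1.0]]]
per_step_variance = estimate_timestep_variance(fp, q)
```

In this toy example the quantized outputs differ from the full-precision ones only at the first step, so only that step receives a non-zero variance. A real calibration would use model outputs from a handful of paired runs (the abstract reports as few as 5) and would typically store them as tensors rather than nested lists.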