Trajectory-Consistent Calibration for Cache-Accelerated Diffusion Models
For practitioners using cache-accelerated diffusion models, TCC improves generation quality (measured by FID) without retraining and without changing the underlying reuse policy.
Cache-based acceleration for diffusion transformers reduces inference cost but degrades quality because reused representations deviate from their full-computation counterparts. The proposed Trajectory-Consistent Calibration (TCC) calibrates cached representations while accounting for the trajectory shift induced by earlier corrections. In a representative FORA-based setting on PixArt-alpha, it reduces FID from 29.83 to 27.35, slightly surpassing the full-computation baseline.
Diffusion Transformers require repeated denoiser evaluations during iterative sampling, making inference computationally expensive. Cache-based acceleration reduces this cost by reusing intermediate representations across denoising steps, but can introduce representation deviations and degrade generation quality. In this paper, we analyze these deviations and show that effective calibration should consider both the direct mismatch caused by reuse and the subsequent trajectory shift induced by earlier corrections. To address this challenge, we propose Trajectory-Consistent Calibration (TCC), a training-free method that calibrates cached representations toward their full-computation counterparts. Specifically, rather than estimating all calibration priors from a single uncorrected cache trajectory, TCC uses an offline iterative procedure so that each prior accounts for the trajectory shift induced by preceding calibrations. Experiments on PixArt-alpha and DiT-XL/2 show that TCC consistently improves FID across representative cache-based acceleration methods while preserving their underlying reuse policies. Notably, in a representative PixArt-alpha cache-acceleration setting based on FORA, TCC reduces FID from 29.83 to 27.35, slightly surpassing the full-computation baseline.
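The difference between estimating all priors from a single uncorrected cache trajectory and estimating them iteratively can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the `feature` function stands in for an expensive denoiser block, the sampler is a simple Euler-style update, and each prior is taken to be an additive per-dimension mean offset over a small calibration batch. The names `run`, `one_shot_priors`, and `iterative_priors` are hypothetical.

```python
import numpy as np


def feature(x, t):
    """Toy stand-in for an expensive denoiser block (hypothetical, not a real DiT)."""
    return np.tanh(x) * (1.0 + 0.1 * t)


def run(x0, num_steps, reuse_steps=(), priors=None, step_size=0.1):
    """Run the toy sampler; return (final state, list of features used at each step)."""
    priors = priors or {}
    x = x0.copy()
    cached = None
    feats = []
    for t in range(num_steps):
        if t in reuse_steps and cached is not None:
            # Reuse the last computed feature; add an additive prior if one exists.
            f = cached + priors.get(t, 0.0)
        else:
            f = feature(x, t)
            cached = f
        feats.append(f)
        x = x - step_size * f
    return x, feats


def one_shot_priors(x0, num_steps, reuse_steps):
    """Estimate every prior from a single uncorrected cache trajectory."""
    _, full = run(x0, num_steps)
    _, cached = run(x0, num_steps, reuse_steps)
    return {t: (full[t] - cached[t]).mean(axis=0) for t in sorted(reuse_steps)}


def iterative_priors(x0, num_steps, reuse_steps):
    """Estimate priors in step order, re-running with all earlier priors applied."""
    _, full = run(x0, num_steps)
    priors = {}
    for t in sorted(reuse_steps):
        # The prior for step t is not set yet, so cached[t] is the raw reused
        # feature on the trajectory already shifted by earlier calibrations.
        _, cached = run(x0, num_steps, reuse_steps, priors)
        priors[t] = (full[t] - cached[t]).mean(axis=0)
    return priors


# Small calibration batch: 16 samples, 4 feature dimensions (assumed sizes).
rng = np.random.default_rng(0)
x0 = rng.normal(size=(16, 4))
reuse = {1, 3, 5, 7}
naive = one_shot_priors(x0, 8, reuse)
sequential = iterative_priors(x0, 8, reuse)
```

In this toy setting the two procedures agree on the first reuse step, since no earlier correction has yet shifted the trajectory, and generally differ afterwards. With the iteratively estimated priors applied, the batch-mean calibrated feature at each reuse step matches the batch-mean full-computation feature on the calibration set, which the one-shot estimate only guarantees at the first reuse step.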