CV · Nov 25, 2025

Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs

Bao Tang, Shuai Zhang, Yueting Zhu, Jijun Xiang, Xin Yang, Li Yu, Wenyu Liu, Xinggang Wang

arXiv:2511.20410v1 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficiency and scalability issues in diffusion model distillation for resource-constrained scenarios, representing an incremental improvement over existing methods.

The paper aims to reduce the reliance of continuous-time consistency distillation for diffusion models on external training data and computational resources. It proposes the Trajectory-Backward Consistency Model (TBCM), which extracts latent representations from the teacher model's generation trajectory. With one-step generation on MJHQ-30k, TBCM achieves 6.52 FID and a CLIP score of 28.08, while cutting training time by about 40% relative to Sana-Sprint and saving GPU memory.

Timestep distillation is an effective approach for improving the generation efficiency of diffusion models. The Consistency Model (CM), as a trajectory-based framework, demonstrates significant potential due to its strong theoretical foundation and high-quality few-step generation. Nevertheless, current continuous-time consistency distillation methods still rely heavily on training data and computational resources, hindering their deployment in resource-constrained scenarios and limiting their scalability to diverse domains. To address this issue, we propose the Trajectory-Backward Consistency Model (TBCM), which eliminates the dependence on external training data by extracting latent representations directly from the teacher model's generation trajectory. Unlike conventional methods that require VAE encoding and large-scale datasets, our self-contained distillation paradigm significantly improves both efficiency and simplicity. Moreover, the trajectory-extracted samples naturally bridge the distribution gap between training and inference, thereby enabling more effective knowledge transfer. Empirically, TBCM achieves 6.52 FID and a CLIP score of 28.08 on MJHQ-30k under one-step generation, while reducing training time by approximately 40% compared to Sana-Sprint and saving a substantial amount of GPU memory, demonstrating superior efficiency without sacrificing quality. We further reveal the diffusion-generation space discrepancy in continuous-time consistency distillation and analyze how sampling strategies affect distillation performance, offering insights for future distillation research. GitHub Link: https://github.com/hustvl/TBCM.
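The idea of drawing training samples from the teacher's own generation trajectory, rather than from VAE-encoded dataset images, can be illustrated with a toy sketch. This is not the authors' implementation: `toy_teacher_velocity`, the linear flow convention, the Euler sampler, and the helper names are all assumptions chosen to keep the example self-contained. It only shows how (latent, timestep) pairs could be collected while the teacher generates from pure noise.

```python
import random


def toy_teacher_velocity(x, t, target=(1.0, -1.0)):
    """Hypothetical stand-in for a teacher network (an assumption, not TBCM's model).

    Assumes the flow x_t = (1 - t) * x_0 + t * noise with one fixed clean
    latent x_0 = target, so the exact velocity is (x_t - x_0) / t.
    """
    return [(xi - gi) / t for xi, gi in zip(x, target)]


def collect_trajectory_pairs(teacher, dim, num_steps, rng):
    """Run an Euler sampler from noise (t = 1) toward t = 0 and record
    every visited (latent, timestep) state. No dataset or encoder is used."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    dt = 1.0 / num_steps
    pairs = []
    for i in range(num_steps):
        t = 1.0 - i * dt                 # t stays strictly positive
        pairs.append((list(x), t))       # store a copy of the current state
        v = teacher(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]  # one Euler step backward in t
    return pairs, x


def sample_training_pair(pairs, rng):
    """Pick one recorded (latent, timestep) pair as a student training input."""
    return rng.choice(pairs)
```

In a real setup the recorded pairs would feed a consistency-distillation loss; how the timesteps are chosen along the trajectory is a separate question, and the abstract notes that sampling strategy affects distillation performance.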

