CV | Dec 25, 2024

Single Trajectory Distillation for Accelerating Image and Video Style Transfer

Sijie Xu, Runqi Wang, Wei Zhu, Dejia Song, Nemo Chen, Xu Tang, Yao Hu

arXiv:2412.18945v13.72 citations | h-index: 4 | Has Code | MM

Originality: Incremental advance

AI Analysis

This work addresses the bottleneck of slow diffusion processes in style transfer for real-world applications, representing an incremental improvement over existing acceleration methods.

The paper tackles the computational expense of diffusion-based stylization by proposing single trajectory distillation (STD) to accelerate image and video style transfer, achieving superior style similarity and aesthetic evaluations compared to existing acceleration models.

Diffusion-based stylization methods typically denoise from a specific partial noise state for image-to-image and video-to-video tasks. This multi-step diffusion process is computationally expensive and hinders real-world application. A promising solution to speed up the process is to obtain few-step consistency models through trajectory distillation. However, current consistency models only force the initial-step alignment between the probability flow ODE (PF-ODE) trajectories of the student and the imperfect teacher models. This training strategy cannot ensure the consistency of the whole trajectories. To address this issue, we propose single trajectory distillation (STD) starting from a specific partial noise state. We introduce a trajectory bank to store the teacher model's trajectory states, mitigating the time cost during training. In addition, we use an asymmetric adversarial loss to enhance the style and quality of the generated images. Extensive experiments on image and video stylization demonstrate that our method surpasses existing acceleration models in terms of style similarity and aesthetic evaluations. Our code and results will be available on the project page: https://single-trajectory-distillation.github.io.
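
To make two ideas from the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the partial noise state comes from the standard forward-noising formula x_tau = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, and it models the trajectory bank as a plain cache that computes a teacher trajectory once per sample and reuses it afterwards. The names `partial_noise_state`, `TrajectoryBank`, and `toy_teacher_step` are hypothetical, and the toy teacher step is only a stand-in for a real PF-ODE solver step.

```python
import math
import random


def partial_noise_state(x0, alpha_bar_tau, rng):
    """Noise a clean sample x0 to an intermediate level (assumed standard
    forward formula; the paper may parameterize this differently)."""
    a = math.sqrt(alpha_bar_tau)
    s = math.sqrt(1.0 - alpha_bar_tau)
    return [a * v + s * rng.gauss(0.0, 1.0) for v in x0]


class TrajectoryBank:
    """Hypothetical cache of teacher trajectory states, keyed by sample id."""

    def __init__(self):
        self._states = {}
        self.teacher_calls = 0  # counts teacher solver steps actually run

    def get(self, sample_id, x_tau, timesteps, teacher_step):
        # Run the teacher only on a cache miss; later lookups are free.
        if sample_id not in self._states:
            states = [x_tau]
            for t_from, t_to in zip(timesteps[:-1], timesteps[1:]):
                states.append(teacher_step(states[-1], t_from, t_to))
                self.teacher_calls += 1
            self._states[sample_id] = dict(zip(timesteps, states))
        return self._states[sample_id]


def toy_teacher_step(x, t_from, t_to):
    # Stand-in for one teacher ODE-solver step (assumption, not a real model):
    # it simply rescales the state as the timestep decreases.
    return [v * (t_to + 1) / (t_from + 1) for v in x]


rng = random.Random(0)
x_tau = partial_noise_state([1.0, -1.0], alpha_bar_tau=0.5, rng=rng)

bank = TrajectoryBank()
timesteps = [600, 400, 200, 0]  # start at a partial noise level, not at t = T
traj = bank.get("img-0", x_tau, timesteps, toy_teacher_step)
traj_again = bank.get("img-0", x_tau, timesteps, toy_teacher_step)
```

In this sketch the second `get` call returns the cached states without running the teacher again, which is the kind of training-time saving the abstract attributes to the trajectory bank. How the actual bank is keyed, filled, and refreshed is not stated in the abstract.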

