CV | Dec 25, 2024

Single Trajectory Distillation for Accelerating Image and Video Style Transfer

arXiv:2412.18945v12 citations | h-index: 4 | Has Code | MM
Originality: Incremental advance
AI Analysis

This work addresses the bottleneck of slow diffusion processes in style transfer for real-world applications, representing an incremental improvement over existing acceleration methods.

The paper tackles the computational expense of diffusion-based stylization by proposing single trajectory distillation (STD) to accelerate image and video style transfer, achieving superior style similarity and aesthetic evaluations compared to existing acceleration models.

Diffusion-based stylization methods typically denoise from a specific partial noise state for image-to-image and video-to-video tasks. This multi-step diffusion process is computationally expensive and hinders real-world application. A promising solution to speed up the process is to obtain few-step consistency models through trajectory distillation. However, current consistency models only force the initial-step alignment between the probability flow ODE (PF-ODE) trajectories of the student and the imperfect teacher models. This training strategy cannot ensure the consistency of whole trajectories. To address this issue, we propose single trajectory distillation (STD) starting from a specific partial noise state. We introduce a trajectory bank to store the teacher model's trajectory states, mitigating the time cost during training. Besides, we use an asymmetric adversarial loss to enhance the style and quality of the generated images. Extensive experiments on image and video stylization demonstrate that our method surpasses existing acceleration models in terms of style similarity and aesthetic evaluations. Our code and results will be available on the project page: https://single-trajectory-distillation.github.io.
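
The trajectory bank idea from the abstract can be illustrated with a small cache. The sketch below is not the authors' implementation: `toy_teacher_step`, `TrajectoryBank`, `get_or_compute`, and `sample_pair` are hypothetical names, the teacher is replaced by a toy update rule, and the least-recently-used eviction policy is an assumption made only to keep the example bounded in memory. It shows one way to store teacher states along a single trajectory that starts from a partial noise level, so they are not recomputed on every training iteration.

```python
import random
from collections import OrderedDict


def toy_teacher_step(x, t, t_next):
    """Hypothetical stand-in for one teacher solver step from time t to t_next.

    A real teacher would call a diffusion model here; this toy rule simply
    scales the state by t_next / t so the example stays self-contained.
    """
    return [v * (t_next / t) for v in x]


class TrajectoryBank:
    """Caches teacher trajectory states per sample (assumed LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()  # sample_id -> list of (t, state)

    def get_or_compute(self, sample_id, x_start, timesteps):
        """Return the cached trajectory, running the teacher only on a miss."""
        if sample_id in self._store:
            self._store.move_to_end(sample_id)  # mark as recently used
            return self._store[sample_id]
        # timesteps[0] is the partial noise level the trajectory starts from.
        traj = [(timesteps[0], x_start)]
        x = x_start
        for t, t_next in zip(timesteps[:-1], timesteps[1:]):
            x = toy_teacher_step(x, t, t_next)
            traj.append((t_next, x))
        self._store[sample_id] = traj
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used
        return traj

    def sample_pair(self, sample_id, x_start, timesteps, rng):
        """Pick two states on the same trajectory, the earlier one noisier."""
        traj = self.get_or_compute(sample_id, x_start, timesteps)
        i = rng.randrange(len(traj) - 1)
        j = rng.randrange(i + 1, len(traj))
        return traj[i], traj[j]


# Usage: decreasing noise levels, starting from a partial noise state.
bank = TrajectoryBank(capacity=2)
rng = random.Random(0)
noisier, cleaner = bank.sample_pair("img-0", [1.0, -2.0], [0.5, 0.25, 0.125], rng)
```

In a training loop built on this sketch, a student would presumably be asked to map both sampled states of the same trajectory to a consistent output; that pairing scheme is an assumption here, as the abstract does not spell out the sampling rule.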
