FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models
This work addresses the computational bottleneck faced by users of diffusion models and offers a practical acceleration solution, though the contribution is incremental because it builds on existing approaches such as reducing NFE.
The paper tackles the high computational cost of diffusion models by introducing FRDiff, a training-free acceleration method that reuses temporally similar feature maps to reduce denoising steps without quality loss, achieving improved latency-fidelity trade-offs in generative tasks.
The substantial computational costs of diffusion models, especially due to the repeated denoising steps necessary for high-quality image generation, present a major obstacle to their widespread adoption. While several studies have attempted to address this issue by reducing the number of score function evaluations (NFE) using advanced ODE solvers without fine-tuning, the decreased number of denoising iterations misses the opportunity to update fine details, resulting in noticeable quality degradation. In our work, we introduce an advanced acceleration technique that leverages the temporal redundancy inherent in diffusion models. Reusing feature maps with high temporal similarity opens up a new opportunity to save computational resources without compromising output quality. To realize the practical benefits of this intuition, we conduct an extensive analysis and propose a novel method, FRDiff. FRDiff is designed to harness the advantages of both reduced NFE and feature reuse, achieving a Pareto frontier in the trade-off between fidelity and latency across various generative tasks.
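To make the feature-reuse idea concrete, the following is a minimal toy sketch and not FRDiff's actual implementation. It assumes a hypothetical expensive block inside a denoiser whose output changes slowly across adjacent steps, so the output is recomputed only every `interval` steps and reused in between. The names `CachedBlock`, `expensive_block`, `run_denoising`, and the fixed-interval refresh schedule are all illustrative assumptions; the abstract does not specify which features are reused or on what schedule.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


class CachedBlock:
    """Wraps an expensive function and reuses its last output between refreshes."""

    def __init__(self, fn: Callable[[Vector], Vector], interval: int) -> None:
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.fn = fn
        self.interval = interval  # assumption: fixed refresh period
        self.cache: Optional[Vector] = None
        self.calls = 0  # number of real (non-reused) evaluations

    def __call__(self, x: Vector, step: int) -> Vector:
        # Recompute on the first step and on every `interval`-th step;
        # otherwise return the cached feature unchanged.
        if self.cache is None or step % self.interval == 0:
            self.cache = self.fn(x)
            self.calls += 1
        return self.cache


def expensive_block(x: Vector) -> Vector:
    # Hypothetical stand-in for a costly network block.
    return [0.1 * v for v in x]


def run_denoising(x: Vector, num_steps: int, interval: int) -> Tuple[Vector, int]:
    """Runs a toy denoising loop and returns the result and the evaluation count."""
    block = CachedBlock(expensive_block, interval)
    for step in range(num_steps):
        residual = block(x, step)
        x = [v - r for v, r in zip(x, residual)]
    return x, block.calls


if __name__ == "__main__":
    for interval in (1, 2, 3):
        _, calls = run_denoising([1.0, 2.0], num_steps=10, interval=interval)
        print(f"interval={interval}: {calls} expensive evaluations")
```

With `interval=1` every step evaluates the block, which corresponds to no reuse. Larger intervals trade fewer evaluations for staler features, which is the fidelity-latency trade-off the abstract describes.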