Edge-preserving noise for diffusion models
This work improves diffusion models for structure-sensitive generation tasks, offering a practical fine-tuning approach for pre-trained systems.
The authors introduce an edge-preserving diffusion process that uses a hybrid noise scheme with an edge-aware scheduler to capture fine structural details while maintaining global performance. The method shows consistent improvements in FID, KID, and CLIP-score, particularly in structure-guided tasks like stroke-to-image synthesis.
Classical diffusion models typically rely on isotropic Gaussian noise, treating all regions uniformly and overlooking structural information important for high-quality generation. We introduce an edge-preserving diffusion process that generalizes isotropic models via a hybrid noise scheme whose edge-aware scheduler smoothly transitions from edge-preserving to isotropic noise. This enables the model to capture fine structural details while generally maintaining global performance. We evaluate the impact of structure-aware noise in both diffusion and flow-matching frameworks, and show that existing isotropic models can be efficiently fine-tuned with edge-preserving noise, making our framework practical for adapting pre-trained systems. Beyond unconditional generation, our method is particularly effective in structure-guided tasks such as stroke-to-image synthesis, where it improves robustness and perceptual quality, as evidenced by consistent gains across FID, KID, and CLIP-score.
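To make the idea of a hybrid noise scheme concrete, the following is a minimal sketch of one way noise could be damped near edges early in the forward process and become isotropic later. It is not the authors' formulation: the abstract gives no equations, so the edge term 1/sqrt(1 + |grad x|/lam), the linear blend schedule, and the names edge_aware_noise_scale, tau_schedule, sample_hybrid_noise, and lam are all assumptions made for illustration.

```python
import numpy as np

def edge_aware_noise_scale(x0, tau, lam=0.1):
    """Per-pixel noise scale for a 2-D grayscale image x0 (float array).

    tau : blend weight in [0, 1]; 0 = fully edge-preserving, 1 = isotropic.
    lam : hypothetical edge-sensitivity constant (an assumption, not from the paper).
    """
    gy, gx = np.gradient(x0)                  # finite-difference image gradients
    grad_mag = np.sqrt(gx**2 + gy**2)         # edge strength per pixel
    edge_term = 1.0 / np.sqrt(1.0 + grad_mag / lam)  # 1 in flat regions, < 1 on edges
    return (1.0 - tau) * edge_term + tau      # blend toward a uniform scale of 1

def tau_schedule(t, T):
    """Hypothetical linear schedule: edge-preserving at t = 0, isotropic at t = T."""
    return float(np.clip(t / T, 0.0, 1.0))

def sample_hybrid_noise(x0, t, T, rng, lam=0.1):
    """Draw Gaussian noise whose per-pixel standard deviation follows the blended scale."""
    scale = edge_aware_noise_scale(x0, tau_schedule(t, T), lam)
    return scale * rng.standard_normal(x0.shape)

# Usage: an 8x8 image with a vertical step edge between columns 3 and 4.
rng = np.random.default_rng(0)
img = np.zeros((8, 8))
img[:, 4:] = 1.0
early = edge_aware_noise_scale(img, tau_schedule(0, 10))   # damped along the edge
late = edge_aware_noise_scale(img, tau_schedule(10, 10))   # uniform everywhere
noise = sample_hybrid_noise(img, t=3, T=10, rng=rng)
```

In this sketch, early time steps inject less noise on the step edge than in flat regions, while the final step reduces to ordinary isotropic Gaussian noise. That matches the transition described in the abstract only qualitatively.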