Edge-preserving noise for diffusion models

Jente Vandersanden, Sascha Holl, Xingchang Huang, Gurprit Singh

arXiv:2410.01540


AI Analysis

This work improves diffusion models for structure-sensitive generation tasks, offering a practical fine-tuning approach for pre-trained systems.

The authors introduce an edge-preserving diffusion process that uses a hybrid noise scheme with an edge-aware scheduler to capture fine structural details while maintaining global performance. The method shows consistent improvements in FID, KID, and CLIP-score, particularly in structure-guided tasks like stroke-to-image synthesis.

Classical diffusion models typically rely on isotropic Gaussian noise, treating all regions uniformly and overlooking structural information important for high-quality generation. We introduce an edge-preserving diffusion process that generalizes isotropic models via a hybrid noise scheme with an edge-aware scheduler that smoothly transitions from edge-preserving to isotropic noise. This enables the model to capture fine structural details while generally maintaining global performance. We evaluate the impact of structure-aware noise in both diffusion and flow-matching frameworks, and show that existing isotropic models can be efficiently fine-tuned with edge-preserving noise, making our framework practical for adapting pre-trained systems. Beyond unconditional generation, our method shows particular gains in structure-guided tasks such as stroke-to-image synthesis, where it improves robustness and perceptual quality, as evidenced by consistent improvements across FID, KID, and CLIP-score.
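The hybrid noise idea described above can be illustrated with a minimal sketch. This is not the authors' formulation: the gradient-based edge weight, the linear transition schedule, and the names `edge_weight`, `transition`, `noise_scale`, `hybrid_noise`, `lam`, and `t_switch` are all assumptions chosen for illustration. The sketch only shows the general pattern of damping noise near edges at early timesteps and blending toward plain isotropic Gaussian noise as the timestep grows.

```python
import numpy as np

def edge_weight(image, lam=0.1):
    """Per-pixel weight in (0, 1]: 1 in flat regions, smaller on strong edges.

    Assumed form (illustrative only): 1 / sqrt(1 + |grad|^2 / lam^2).
    """
    gy, gx = np.gradient(image.astype(np.float64))
    grad_sq = gx ** 2 + gy ** 2
    return 1.0 / np.sqrt(1.0 + grad_sq / lam ** 2)

def transition(t, t_switch=0.5):
    """Hypothetical scheduler: 0 at t=0 (edge-preserving), 1 for t >= t_switch (isotropic)."""
    return float(np.clip(t / t_switch, 0.0, 1.0))

def noise_scale(image, t, lam=0.1, t_switch=0.5):
    """Blend the edge-aware scale with the isotropic scale (all ones)."""
    tau = transition(t, t_switch)
    return (1.0 - tau) * edge_weight(image, lam) + tau

def hybrid_noise(image, t, rng, lam=0.1, t_switch=0.5):
    """Sample Gaussian noise whose per-pixel standard deviation follows noise_scale."""
    return noise_scale(image, t, lam, t_switch) * rng.standard_normal(image.shape)

# Usage: a grayscale image with a vertical step edge, t normalized to [0, 1].
rng = np.random.default_rng(0)
img = np.zeros((8, 8))
img[:, 4:] = 1.0
early = hybrid_noise(img, t=0.0, rng=rng)  # noise is damped along the edge
late = hybrid_noise(img, t=1.0, rng=rng)   # noise is isotropic everywhere
```

Under these assumptions, pixels along the step edge receive lower-variance noise at small t, so edge structure survives longer in the forward process, while at large t the process matches a standard isotropic one.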
