CV · AI · LG · Aug 15, 2023

Dynamic Attention-Guided Diffusion for Image Super-Resolution

arXiv:2308.07977v4 · 10 citations · h-index: 59
Originality: Incremental advance
AI Analysis

This work addresses image quality issues in super-resolution for applications like photography and computer vision, representing an incremental improvement over existing diffusion-based methods.

The paper addresses a limitation of diffusion models for image super-resolution: they treat all image regions uniformly, which can introduce artifacts. It proposes YODA, a dynamic attention-guided diffusion process that selectively focuses on detail-rich areas, and reports new state-of-the-art results on face and general SR tasks, with improvements in PSNR, SSIM, and LPIPS.

Diffusion models in image Super-Resolution (SR) treat all image regions uniformly, which risks compromising the overall image quality by potentially introducing artifacts during denoising of less-complex regions. To address this, we propose "You Only Diffuse Areas" (YODA), a dynamic attention-guided diffusion process for image SR. YODA selectively focuses on spatial regions defined by attention maps derived from the low-resolution images and the current denoising time step. This time-dependent targeting enables a more efficient conversion to high-resolution outputs by focusing on areas that benefit the most from the iterative refinement process, i.e., detail-rich objects. We empirically validate YODA by extending leading diffusion-based methods SR3, DiffBIR, and SRDiff. Our experiments demonstrate new state-of-the-art performances in face and general SR tasks across PSNR, SSIM, and LPIPS metrics. As a side effect, we find that YODA reduces color shift issues and stabilizes training with small batches.
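The abstract describes the targeting only at a high level, so the following is a minimal sketch of how a time-dependent mask could be built from an attention map, not the authors' implementation. The thresholding rule (a pixel stays active while `t <= attention * total_steps`), the function names `time_dependent_mask`, `blend`, and `active_fraction`, and the toy 2x2 attention map are all assumptions made for illustration. Under this assumed rule, high-attention pixels are refined over more steps of the reverse process, while low-attention pixels are passed through unchanged for most of it.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[int]]


def time_dependent_mask(attention: Grid, t: int, total_steps: int) -> Mask:
    """Return a binary mask: 1 where a pixel is refined at reverse step t.

    Assumed rule (illustrative only): a pixel with attention a in [0, 1]
    is active while t <= a * total_steps.
    """
    if not 0 <= t <= total_steps:
        raise ValueError("t must lie in [0, total_steps]")
    return [[1 if a * total_steps >= t else 0 for a in row] for row in attention]


def blend(refined: Grid, passthrough: Grid, mask: Mask) -> Grid:
    """Keep the refined value where mask is 1, the pass-through value elsewhere."""
    return [
        [r if m else p for r, p, m in zip(r_row, p_row, m_row)]
        for r_row, p_row, m_row in zip(refined, passthrough, mask)
    ]


def active_fraction(mask: Mask) -> float:
    """Fraction of pixels that are refined under the given mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical 2x2 attention map derived from a low-resolution image.
attention = [[0.9, 0.2], [0.5, 0.0]]
T = 10

# The reverse process runs from t = T down to t = 1; the active area grows.
for t in (10, 5, 1):
    mask = time_dependent_mask(attention, t, T)
    print(f"t={t}: mask={mask}, active={active_fraction(mask):.2f}")
```

In a real pipeline, `refined` would be the output of one denoising step and `passthrough` a suitably noised version of the upsampled low-resolution input; both are stand-ins here.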

