CV AI LG Aug 15, 2023

Dynamic Attention-Guided Diffusion for Image Super-Resolution

Brian B. Moser, Stanislav Frolov, Federico Raue, Sebastian Palacio, Andreas Dengel

arXiv:2308.07977v49.110 citations h-index: 59

Originality: Incremental advance

AI Analysis

This work addresses image quality issues in super-resolution for applications like photography and computer vision, representing an incremental improvement over existing diffusion-based methods.

The paper addresses a weakness of diffusion models for image super-resolution: they treat all image regions uniformly, which can introduce artifacts. It proposes YODA, a dynamic attention-guided diffusion process that selectively focuses on detail-rich areas, and reports new state-of-the-art performance on face and general SR tasks, with improvements in PSNR, SSIM, and LPIPS.

Diffusion models in image Super-Resolution (SR) treat all image regions uniformly, which risks compromising the overall image quality by potentially introducing artifacts during denoising of less-complex regions. To address this, we propose "You Only Diffuse Areas" (YODA), a dynamic attention-guided diffusion process for image SR. YODA selectively focuses on spatial regions defined by attention maps derived from the low-resolution images and the current denoising time step. This time-dependent targeting enables a more efficient conversion to high-resolution outputs by focusing on areas that benefit the most from the iterative refinement process, i.e., detail-rich objects. We empirically validate YODA by extending leading diffusion-based methods SR3, DiffBIR, and SRDiff. Our experiments demonstrate new state-of-the-art performance in face and general SR tasks across PSNR, SSIM, and LPIPS metrics. As a side effect, we find that YODA reduces color shift issues and stabilizes training with small batches.
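The time-dependent targeting described in the abstract can be illustrated with a small sketch. The abstract does not give the exact masking rule, so the rule below is an assumption made for illustration: a pixel with attention value a in [0, 1] is refined only while the time step t satisfies t <= T * max(a, floor), so high-attention pixels receive more denoising steps. The names `time_dependent_mask`, `masked_update`, and `floor` are hypothetical and do not come from the paper.

```python
def time_dependent_mask(attention, t, total_steps, floor=0.0):
    """Binary mask of pixels that are refined at denoising step t.

    Assumed rule (not taken from the paper): a pixel with attention a
    is active when t <= total_steps * max(a, floor). Steps count down
    from total_steps to 1, so high-attention pixels become active
    earlier and are refined over more steps.
    """
    return [
        [1 if t <= total_steps * max(a, floor) else 0 for a in row]
        for row in attention
    ]


def masked_update(current, refined, mask):
    """Use the refined value where the mask is 1, keep the current value elsewhere."""
    return [
        [r if m else c for c, r, m in zip(c_row, r_row, m_row)]
        for c_row, r_row, m_row in zip(current, refined, mask)
    ]


# Toy 2x2 attention map (hypothetical values) and T = 10 steps.
attention = [[0.1, 0.9],
             [0.5, 1.0]]
mask_start = time_dependent_mask(attention, t=10, total_steps=10)
mask_mid = time_dependent_mask(attention, t=5, total_steps=10)
```

In this toy setup only the pixel with attention 1.0 is refined at the first step (t = 10), while by t = 5 every pixel with attention of at least 0.5 is active. A real implementation would apply such a mask to the output of a denoising network at each step; that part is omitted here.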
