CV | Mar 2, 2024

Text-guided Explorable Image Super-resolution

arXiv:2403.01124v1 | 12 citations | h-index: 8 | CVPR
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible and user-controllable image super-resolution tools, offering an incremental improvement by adapting existing diffusion models for zero-shot text-guided exploration.

The paper tackles the problem of generating diverse, semantically accurate high-resolution images from low-resolution inputs using text guidance without training on specific degradations, achieving improved restoration quality, diversity, and explorability.

In this paper, we introduce the problem of zero-shot text-guided exploration of the solutions to open-domain image super-resolution. Our goal is to allow users to explore diverse, semantically accurate reconstructions that preserve data consistency with the low-resolution inputs for different large downsampling factors without explicitly training for these specific degradations. We propose two approaches for zero-shot text-guided super-resolution: (i) modifying the generative process of text-to-image (T2I) diffusion models to promote consistency with low-resolution inputs, and (ii) incorporating language guidance into zero-shot diffusion-based restoration methods. We show that the proposed approaches result in diverse solutions that match the semantic meaning provided by the text prompt while preserving data consistency with the degraded inputs. We evaluate the proposed baselines for the task of extreme super-resolution and demonstrate advantages in terms of restoration quality, diversity, and explorability of solutions.
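To make "data consistency with the low-resolution inputs" concrete, here is a minimal sketch of one common way to enforce it: project a candidate high-resolution image onto the set of images that downsample exactly to the observation. The abstract does not specify the degradation operator or the exact update rule, so the average-pooling downsampler and the function names (`downsample`, `upsample`, `project_consistent`) are assumptions for illustration, not the paper's implementation.

```python
import numpy as np


def downsample(x: np.ndarray, s: int) -> np.ndarray:
    """Average-pool an (H, W) image by factor s (assumed degradation A)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))


def upsample(y: np.ndarray, s: int) -> np.ndarray:
    """Pseudo-inverse of average pooling: copy each pixel into an s x s block."""
    return np.kron(y, np.ones((s, s)))


def project_consistent(x: np.ndarray, y: np.ndarray, s: int) -> np.ndarray:
    """Return the image closest to x whose downsampled version equals y.

    Implements x + A_pinv(y - A x) for the assumed average-pooling A.
    """
    return x + upsample(y - downsample(x, s), s)


rng = np.random.default_rng(0)
s = 4                      # downsampling factor (illustrative)
y = rng.random((2, 2))     # low-resolution observation
x = rng.random((8, 8))     # stand-in for a generated high-resolution candidate

x_hat = project_consistent(x, y, s)
residual = np.abs(downsample(x_hat, s) - y).max()
print(f"max consistency residual: {residual:.2e}")
```

Under these assumptions the projection changes only the block averages of the candidate and leaves the detail within each block untouched, which is why many different high-resolution images can remain consistent with the same low-resolution input.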

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
