CVNov 24, 2023

Image Super-Resolution with Text Prompt Diffusion

arXiv:2311.14282v527 citationsh-index: 13Has Code
Originality Incremental advance
AI Analysis

This work addresses performance limitations in image super-resolution for applications like photo enhancement, but it is incremental as it builds on existing multi-modal methods.

The paper tackles the challenge of extracting degradation information in image super-resolution by introducing text prompts as additional priors, achieving impressive results on synthetic and real-world images.

Image super-resolution (SR) methods typically model degradation to improve reconstruction accuracy in complex and unknown degradation scenarios. However, extracting degradation information from low-resolution images is challenging, which limits the model performance. To boost image SR performance, one feasible approach is to introduce additional priors. Inspired by advancements in multi-modal methods and text prompt image processing, we introduce text prompts to image SR to provide degradation priors. Specifically, we first design a text-image generation pipeline to integrate text into the SR dataset through the text degradation representation and degradation model. By adopting a discrete design, the text representation is flexible and user-friendly. Meanwhile, we propose the PromptSR to realize the text prompt SR. The PromptSR leverages the latest multi-modal large language model (MLLM) to generate prompts from low-resolution images. It also utilizes the pre-trained language model (e.g., T5 or CLIP) to enhance text comprehension. We train the PromptSR on the text-image dataset. Extensive experiments indicate that introducing text prompts into SR, yields impressive results on both synthetic and real-world images. Code: https://github.com/zhengchen1999/PromptSR.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes