LG, AI, CV, ML | Oct 2, 2023

Prompt-tuning latent diffusion models for inverse problems

arXiv:2310.01110v1 | 77 citations | h-index: 74
Originality: Incremental advance
AI Analysis

This work targets imaging inverse problems, such as super-resolution and deblurring, by improving how latent diffusion models are used as priors. The advance is incremental: it builds on prior diffusion-based solvers.

The authors address the suboptimal performance of latent diffusion models on imaging inverse problems when the models are conditioned on simple null text prompts. They introduce prompt tuning and a projection step to improve image fidelity and reduce artifacts, and report better results than existing solvers on super-resolution, deblurring, and inpainting.

We propose a new method for solving imaging inverse problems using text-to-image latent diffusion models as general priors. Existing methods using latent diffusion models for inverse problems typically rely on simple null text prompts, which can lead to suboptimal performance. To address this limitation, we introduce a method for prompt tuning, which jointly optimizes the text embedding on-the-fly while running the reverse diffusion process. This allows us to generate images that are more faithful to the diffusion prior. In addition, we propose a method to keep the evolution of latent variables within the range space of the encoder, by projection. This helps to reduce image artifacts, a major problem when using latent diffusion models instead of pixel-based diffusion models. Our combined method, called P2L, outperforms both image- and latent-diffusion model-based inverse problem solvers on a variety of tasks, such as super-resolution, deblurring, and inpainting.
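
The two ideas in the abstract can be illustrated with a small linear toy. This is only a sketch under strong assumptions, not the P2L implementation: the encoder `E`, decoder `D`, forward operator `A`, conditioning map `W`, the function names, and the data-fidelity objective used to tune the embedding are all hypothetical stand-ins chosen so the example runs with NumPy alone. A real solver would use a pretrained autoencoder and a text-conditioned denoising network inside a full reverse diffusion loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r, k, m = 16, 8, 5, 4, 6  # pixels, latent dim, encoder rank, embedding dim, measurements

# Toy linear autoencoder (assumption): the encoder E has rank r < d, so its
# range is a proper subspace of latent space and E @ D is the orthogonal
# projector onto that subspace.
Q, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis of range(E)
B = rng.standard_normal((r, n))
E = Q @ B                                         # encoder: pixels -> latents
D = np.linalg.pinv(B) @ Q.T                       # decoder: latents -> pixels
A = rng.standard_normal((m, n))                   # toy linear forward operator
W = rng.standard_normal((d, k))                   # toy map from embedding to latent shift


def denoise(z, c):
    """Stand-in for the conditional clean-latent estimate (a network in practice)."""
    return z + W @ c


def project_to_encoder_range(z):
    """Decode, then re-encode, so the latent stays in the range of the encoder."""
    return E @ (D @ z)


def fidelity(z, c, y):
    """Assumed tuning objective: 0.5 * ||y - A D denoise(z, c)||^2."""
    res = y - A @ (D @ denoise(z, c))
    return 0.5 * float(res @ res)


def tune_embedding(z, c, y, steps=20):
    """Gradient descent on the embedding c with the latent z held fixed."""
    M = A @ D @ W                                 # Jacobian of the measurement w.r.t. c
    lr = 1.0 / np.linalg.norm(M, 2) ** 2          # 1 / Lipschitz constant of the gradient
    for _ in range(steps):
        res = y - A @ (D @ denoise(z, c))
        c = c + lr * (M.T @ res)                  # descent step: gradient is -M.T @ res
    return c


# One toy "reverse step": tune the embedding, denoise, then project.
y = A @ rng.standard_normal(n)                    # measurement of a hidden toy signal
z = rng.standard_normal(d)                        # current latent
c = np.zeros(k)                                   # stand-in for the null-prompt embedding

loss_before = fidelity(z, c, y)
c = tune_embedding(z, c, y)
loss_after = fidelity(z, c, y)
z = project_to_encoder_range(denoise(z, c))
```

In this toy, the embedding update lowers the measurement residual relative to the fixed null-prompt stand-in, and the projection is idempotent, so applying it again leaves the latent unchanged. Neither property is guaranteed to carry over unchanged to a nonlinear autoencoder.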

