Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising
Prompt-SID addresses two challenges in denoising, the expense of paired training data and the loss of image information, for applications like medical imaging; the work nonetheless appears incremental, as it builds on existing prompt-learning and diffusion methods.
The paper tackles the problem of single-image denoising by proposing Prompt-SID, a self-supervised framework that preserves structural details, achieving state-of-the-art results on synthetic, real-world, and fluorescence imaging datasets.
Many studies have focused on building supervised image-denoising models from paired datasets, which are expensive and time-consuming to collect. Current self-supervised and unsupervised approaches typically rely on blind-spot networks or sub-image pair sampling, which lose pixel information and destroy detailed structural information, significantly limiting their effectiveness. In this paper, we introduce Prompt-SID, a prompt-learning-based single-image denoising framework that emphasizes preserving structural details. The approach is trained in a self-supervised manner using downsampled image pairs. It captures original-scale image information through structural encoding and integrates this prompt into the denoiser. To achieve this, we propose a structural representation generation model based on the latent diffusion process and design a structural attention module within the transformer-based denoiser architecture to decode the prompt. Additionally, we introduce a scale replay training mechanism, which effectively mitigates the scale gap between images of different resolutions. We conduct comprehensive experiments on synthetic, real-world, and fluorescence imaging datasets, showcasing the remarkable effectiveness of Prompt-SID. Our code will be released at https://github.com/huaqlili/Prompt-SID.
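To make the idea of training on downsampled image pairs concrete, the following is a minimal sketch of one common way to draw such a pair from a single noisy image: split the image into 2x2 cells and pick two different pixels from each cell. It is an illustrative assumption, not the authors' implementation. The function name `sample_subimage_pair`, the 2x2 cell size, and the uniform random choice are all hypothetical, and the abstract does not specify how Prompt-SID samples its pairs.

```python
import numpy as np


def sample_subimage_pair(img, rng):
    """Return two half-resolution sub-images of a 2D image.

    Hypothetical helper: each output pixel is drawn from the same 2x2 cell
    of the input, and the two outputs never share the same source pixel.
    """
    h, w = img.shape
    h2, w2 = h // 2, w // 2
    # Crop to even size, then regroup so the last axis holds the 4 pixels of each cell.
    cells = (
        img[: 2 * h2, : 2 * w2]
        .reshape(h2, 2, w2, 2)
        .transpose(0, 2, 1, 3)
        .reshape(h2, w2, 4)
    )
    # First pick: any of the 4 positions in the cell.
    idx1 = rng.integers(0, 4, size=(h2, w2))
    # Second pick: shift by 1..3 (mod 4) so it always differs from the first.
    idx2 = (idx1 + rng.integers(1, 4, size=(h2, w2))) % 4
    sub1 = np.take_along_axis(cells, idx1[..., None], axis=2)[..., 0]
    sub2 = np.take_along_axis(cells, idx2[..., None], axis=2)[..., 0]
    return sub1, sub2


# Usage: one sub-image can serve as network input and the other as its target.
noisy = np.random.default_rng(0).normal(size=(8, 8))
a, b = sample_subimage_pair(noisy, np.random.default_rng(1))
```

Each sub-image keeps only one pixel in four, which illustrates the information loss the abstract attributes to sub-image pair sampling and that the structural prompt from the original-scale image is meant to compensate for.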