CV · Jul 15, 2024

Integrating Amortized Inference with Diffusion Models for Learning Clean Distribution from Corrupted Images

arXiv:2407.11162v1 · 15 citations · h-index: 4
Originality: Incremental advance
AI Analysis

The work addresses a practical limitation for researchers and practitioners in fields such as computational imaging, where clean data is expensive to obtain, though it is an incremental improvement on existing diffusion model methods.

The paper tackles the problem of training diffusion models without large-scale clean data by introducing FlowDiff, a joint training paradigm that uses a conditional normalizing flow with amortized inference to learn from corrupted images. FlowDiff learns clean distributions from corrupted sources, outperforms baselines by significant margins, and improves performance in downstream tasks such as inpainting and denoising.

Diffusion models (DMs) have emerged as powerful generative models for solving inverse problems, offering a good approximation of the prior distributions of real-world image data. Typically, diffusion models rely on large-scale clean signals to accurately learn the score functions of ground-truth clean image distributions. However, such a requirement for large amounts of clean data is often impractical in real-world applications, especially in fields where data samples are expensive to obtain. To address this limitation, we introduce FlowDiff, a novel joint training paradigm that leverages a conditional normalizing flow model to facilitate the training of diffusion models on corrupted data sources. The conditional normalizing flow learns to recover clean images through a novel amortized inference mechanism, and can thus supply the diffusion model with training signal from corrupted data. Conversely, the diffusion model provides a strong prior that in turn improves the quality of image recovery. The flow model and the diffusion model therefore reinforce each other and demonstrate strong empirical performance. Our extensive experiments show that FlowDiff can effectively learn clean distributions across a wide range of corrupted data sources, such as noisy and blurry images. It consistently outperforms existing baselines by significant margins under identical conditions. We also study the learned diffusion prior and observe its superior performance in downstream computational imaging tasks, including inpainting, denoising, and deblurring.
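
To make the idea of amortized inference from corrupted observations concrete, here is a minimal one-dimensional sketch. It is not the paper's implementation: it assumes a noisy observation model y = x + n, replaces the conditional normalizing flow with a hypothetical affine map x = a*y + b*z, and replaces the learned diffusion prior with a fixed standard normal prior. The function name `neg_elbo` and all parameters are illustrative. The objective is the standard variational one (data fit, plus negative log prior, plus log density of the flow sample), which is a common form of amortized inference and may differ from the exact loss used in FlowDiff.

```python
import numpy as np

def neg_elbo(a, b, y, sigma, z):
    """Monte Carlo estimate of KL(q(x|y) || p(x|y)), up to an additive constant.

    q(x|y) is a toy affine "flow": x = a*y + b*z with z ~ N(0, 1) and b > 0.
    The prior p(x) = N(0, 1) is an assumed stand-in for a learned diffusion prior.
    """
    x = a * y[:, None] + b * z[None, :]                   # samples from q(x|y), shape (n_y, n_z)
    data_term = (y[:, None] - x) ** 2 / (2 * sigma ** 2)  # -log p(y|x) + const
    prior_term = x ** 2 / 2                               # -log p(x) + const
    log_q = -z[None, :] ** 2 / 2 - np.log(b)              # log q(x|y) + const (change of variables)
    return float(np.mean(data_term + prior_term + log_q))

rng = np.random.default_rng(0)
sigma = 0.5
x_clean = rng.standard_normal(2000)                   # never shown to the "flow"
y = x_clean + sigma * rng.standard_normal(2000)       # corrupted observations only
z = rng.standard_normal(256)                          # base samples shared across y

# Exact Gaussian posterior for this toy model: mean y / (1 + sigma^2),
# standard deviation sigma / sqrt(1 + sigma^2).
a_star = 1.0 / (1.0 + sigma ** 2)
b_star = sigma / np.sqrt(1.0 + sigma ** 2)

loss_star = neg_elbo(a_star, b_star, y, sigma, z)
loss_off = neg_elbo(1.0, 0.1, y, sigma, z)            # a deliberately poor choice of (a, b)
print(loss_star, loss_off)
```

In this toy setting the objective is lower at the exact posterior parameters than at the mismatched ones, and a single pair (a, b) serves every observation y, which is the sense in which the inference is amortized. In a joint scheme of the kind the abstract describes, samples recovered this way would be used to train the prior, and the prior term would be supplied by the diffusion model rather than fixed.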
