CV · Jul 1, 2024

Blind Inversion using Latent Diffusion Priors

arXiv:2407.01027v1 · 12 citations · h-index: 4
Originality: Highly original
AI Analysis

This work addresses a practical limitation of applying diffusion models to inverse problems, namely that forward operators are often costly to acquire, and it enables new capabilities in non-linear 3D inverse rendering.

The paper tackles blind inverse problems, in which the forward operator is unknown. It introduces LatentDEM, which uses latent diffusion priors within an Expectation-Maximization framework to recover clean images and estimate the forward operator jointly. The authors report better performance than prior methods on 2D blind deblurring and 3D sparse-view reconstruction.

Diffusion models have emerged as powerful tools for solving inverse problems due to their exceptional ability to model complex prior distributions. However, existing methods predominantly assume known forward operators (i.e., non-blind), limiting their applicability in practical settings where acquiring such operators is costly. Additionally, many current approaches rely on pixel-space diffusion models, leaving the potential of more powerful latent diffusion models (LDMs) underexplored. In this paper, we introduce LatentDEM, an innovative technique that addresses more challenging blind inverse problems using latent diffusion priors. At the core of our method is solving blind inverse problems within an iterative Expectation-Maximization (EM) framework: (1) the E-step recovers clean images from corrupted observations using LDM priors and a known forward model, and (2) the M-step estimates the forward operator based on the recovered images. Additionally, we propose two novel optimization techniques tailored for LDM priors and EM frameworks, yielding more accurate and efficient blind inversion results. As a general framework, LatentDEM supports both linear and non-linear inverse problems. Beyond common 2D image restoration tasks, it enables new capabilities in non-linear 3D inverse rendering problems. We validate LatentDEM's performance on representative 2D blind deblurring and 3D sparse-view reconstruction tasks, demonstrating its superior efficacy over prior arts.
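
The alternation described in the abstract can be illustrated with a deliberately simplified sketch. It is not LatentDEM: the latent diffusion prior is swapped for plain quadratic (Tikhonov) penalties, posterior sampling in the E-step is replaced by a closed-form point estimate, and the forward operator is assumed to be a 1D circular convolution with an unknown kernel. All names (`e_step`, `m_step`, `blind_deconvolve`, `lam`, `mu`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def joint_objective(y, x, k, lam, mu):
    """Data misfit plus quadratic penalties on the signal x and kernel k."""
    pred = np.fft.irfft(np.fft.rfft(k) * np.fft.rfft(x), n=y.size)
    resid = y - pred
    return float(resid @ resid + lam * (x @ x) + mu * (k @ k))


def e_step(y, k, lam):
    """Update the signal with the kernel held fixed.

    Assumption: a ridge-regularized least-squares solve stands in for
    the diffusion-prior-based recovery described in the abstract.
    """
    Y, K = np.fft.rfft(y), np.fft.rfft(k)
    X = np.conj(K) * Y / (np.abs(K) ** 2 + lam)
    return np.fft.irfft(X, n=y.size)


def m_step(y, x, mu):
    """Update the kernel (the forward operator) with the signal held fixed."""
    Y, X = np.fft.rfft(y), np.fft.rfft(x)
    K = np.conj(X) * Y / (np.abs(X) ** 2 + mu)
    return np.fft.irfft(K, n=y.size)


def blind_deconvolve(y, k_init, lam=1e-2, mu=1e-2, n_iters=20):
    """Alternate the two updates and record the joint objective."""
    k = k_init.copy()
    history = []
    for _ in range(n_iters):
        x = e_step(y, k, lam)
        k = m_step(y, x, mu)
        history.append(joint_objective(y, x, k, lam, mu))
    return x, k, history


# Synthetic example: blur a random signal with a small kernel, add noise.
rng = np.random.default_rng(0)
n = 64
x_true = rng.standard_normal(n)
k_true = np.zeros(n)
k_true[:5] = [0.1, 0.2, 0.4, 0.2, 0.1]
y = np.fft.irfft(np.fft.rfft(k_true) * np.fft.rfft(x_true), n=n)
y = y + 0.01 * rng.standard_normal(n)

# Start from an identity (delta) kernel, i.e. "no blur".
k_init = np.zeros(n)
k_init[0] = 1.0
x_hat, k_hat, history = blind_deconvolve(y, k_init)
```

Because each update exactly minimizes the same joint objective over one block of variables, the recorded objective cannot increase from one iteration to the next. With priors this weak, the sketch does not resolve the inherent ambiguity of blind deconvolution, so `x_hat` and `k_hat` should not be expected to match `x_true` and `k_true`; it only shows the shape of the alternation.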

