Amortized Guidance for Image Inpainting with Pretrained Diffusion Models

arXiv:2605.1301039.2

AI Analysis

This work addresses the need for efficient, high-quality image inpainting using pretrained diffusion models, offering a practical middle ground between task-specific training and per-instance optimization.

The authors introduce Amortized Inpainting with Diffusion (AID), a method that trains a small reusable guidance module for pretrained diffusion models, enabling high-quality image inpainting without per-instance optimization. AID consistently improves quality-speed trade-offs over baselines on AFHQv2, FFHQ, and ImageNet, adding less than 1% trainable overhead.

We study image inpainting with generative diffusion models. Existing methods typically either train dedicated task-specific models or adapt a pretrained diffusion model separately for each masked image at deployment. We introduce a middle-ground model, termed Amortized Inpainting with Diffusion (AID), which keeps a pretrained diffusion backbone fixed, trains a small reusable guidance module offline, and then reuses it across masked images without per-instance optimization. We formulate inpainting as a deterministic guidance problem with a supervised terminal objective. To make this problem learnable in high dimensions, we derive an auxiliary Gaussian formulation and prove that solving this randomized problem recovers the optimal deterministic guidance field. This bridge yields a principled continuous-time actor-critic algorithm for learning the guidance module in a fully data-driven manner. Empirically, on AFHQv2 and FFHQ under the pixel EDM pipeline and on ImageNet under the latent EDM2 pipeline, AID consistently improves the quality-speed trade-off over strong fixed-backbone and amortized inpainting baselines across multiple mask types, while adding less than one percent trainable overhead.
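The inference-time structure described in the abstract (a fixed backbone plus a small reusable guidance term, applied to any masked image without per-instance optimization) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the authors' implementation: `frozen_backbone_drift` is a stand-in for the pretrained model's sampling drift, `guidance` is a hypothetical one-parameter module, the mask convention (1 = known pixel) is chosen here, and the actor-critic training of the guidance parameters is not shown.

```python
import random

def frozen_backbone_drift(x, t):
    # Toy stand-in for the pretrained backbone's deterministic sampling drift.
    # It has no trainable parameters here, mirroring the fixed backbone.
    return [-xi for xi in x]

def guidance(params, x, t, observed, mask):
    # Hypothetical guidance module with a single parameter ("gain").
    # It pushes known pixels (mask == 1) toward their observed values.
    gain = params["gain"]
    return [gain * m * (o - xi) for xi, o, m in zip(x, observed, mask)]

def sample(params, observed, mask, steps=50, seed=0):
    # Euler integration of dx/dt = backbone drift + guidance, from noise.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in observed]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        b = frozen_backbone_drift(x, t)
        u = guidance(params, x, t, observed, mask)
        x = [xi + dt * (bi + ui) for xi, bi, ui in zip(x, b, u)]
    return x

# The same params are reused for different masked inputs (amortization):
params = {"gain": 20.0}  # assumed to have been learned offline
image_a = sample(params, observed=[1.0, 0.0, 1.0, 0.0], mask=[1, 0, 1, 0])
image_b = sample(params, observed=[0.5, 0.5, 0.0, 0.0], mask=[1, 1, 0, 0])
```

In this toy setup the per-image cost is a single pass of the sampler; nothing is optimized at deployment, which is the property the abstract contrasts with per-instance adaptation.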
