CV · LG · Mar 16, 2025

LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization

arXiv:2503.12615v2 · 15 citations · h-index: 4 · Has Code
Originality: Highly original
AI Analysis

This work targets two obstacles in inverse imaging problems, computational inefficiency and prompt calibration, and is aimed at researchers and practitioners in computer vision and medical imaging. It introduces a novel method rather than an incremental improvement.

The paper tackles the challenge of using text-to-image latent diffusion models for zero-shot inverse problems. It proposes LATINO, a Plug & Play framework that uses Latent Consistency Models for efficient inference, reaches state-of-the-art quality in as few as 8 neural function evaluations, and is more memory and computationally efficient than previous approaches.

Text-to-image latent diffusion models (LDMs) have recently emerged as powerful generative models with great potential for solving inverse problems in imaging. However, leveraging such models in a Plug & Play (PnP), zero-shot manner remains challenging because it requires identifying a suitable text prompt for the unknown image of interest. Also, existing text-to-image PnP approaches are highly computationally expensive. We herein address these challenges by proposing a novel PnP inference paradigm specifically designed for embedding generative models within stochastic inverse solvers, with special attention to Latent Consistency Models (LCMs), which distill LDMs into fast generators. We leverage our framework to propose LAtent consisTency INverse sOlver (LATINO), the first zero-shot PnP framework to solve inverse problems with priors encoded by LCMs. Our conditioning mechanism avoids automatic differentiation and reaches SOTA quality in as little as 8 neural function evaluations. As a result, LATINO delivers remarkably accurate solutions and is significantly more memory and computationally efficient than previous approaches. We then embed LATINO within an empirical Bayesian framework that automatically calibrates the text prompt from the observed measurements by marginal maximum likelihood estimation. Extensive experiments show that prompt self-calibration greatly improves estimation, allowing LATINO with PRompt Optimization to define new SOTAs in image reconstruction quality and computational efficiency. The code is available at https://latino-pro.github.io
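The Plug & Play idea in the abstract, alternating a prior step with a data-consistency step that needs no automatic differentiation, can be illustrated with a toy sketch. This is not the LATINO algorithm: the 1D inpainting problem, the moving-average `smooth_prior` (a hypothetical stand-in for a learned generator such as an LCM), and all parameter values are assumptions made for illustration. The default of 8 iterations only echoes the evaluation count quoted above.

```python
def smooth_prior(x):
    """Hypothetical stand-in for a learned prior: 3-tap moving average."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[max(i - 1, 0)]        # replicate the edge value at the boundary
        right = x[min(i + 1, n - 1)]
        out.append((left + x[i] + right) / 3.0)
    return out


def prox_data_fidelity(z, y, mask, sigma2, delta):
    """Closed-form proximal step for f(x) = ||mask * x - y||^2 / (2 * sigma2).

    Observed entries are pulled toward the measurement y; unobserved
    entries keep the prior's proposal z. No gradients are computed.
    """
    return [
        (delta * yi + sigma2 * zi) / (delta + sigma2) if mi else zi
        for zi, yi, mi in zip(z, y, mask)
    ]


def pnp_solve(y, mask, sigma2=0.01, delta=0.1, n_iters=8):
    """Alternate the prior step and the data-consistency step (assumed values)."""
    x = [yi if mi else 0.0 for yi, mi in zip(y, mask)]  # zero-fill missing entries
    for _ in range(n_iters):
        z = smooth_prior(x)                                # prior step
        x = prox_data_fidelity(z, y, mask, sigma2, delta)  # data-consistency step
    return x


# Toy inpainting problem: a triangular signal with two missing samples.
truth = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25, 0.0]
mask = [1, 1, 0, 1, 1, 1, 0, 1, 1]
y = [t if m else 0.0 for t, m in zip(truth, mask)]
estimate = pnp_solve(y, mask)
print([round(v, 3) for v in estimate])
```

In this sketch the data-consistency step has a closed form because the forward operator is a simple mask. For a general linear operator the same step would require solving a linear system, which is still possible without automatic differentiation.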

