CV | Jul 19, 2023

LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network

arXiv:2307.09815v3 | 18 citations | h-index: 26
Originality: Incremental advance
AI Analysis

This paper addresses image defocus deblurring for photography and computational imaging. It is an incremental advance that integrates a vision-language model (CLIP) into an existing task.

The paper tackles the recovery of sharp images from dual-pixel pairs with disparity-dependent blur. It proposes a framework that uses CLIP to estimate blur maps without supervision, and it reports state-of-the-art performance in its experiments.

Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training framework (CLIP) to accurately estimate the blur map from a DP pair in an unsupervised manner. To achieve this, we first carefully design text prompts to enable CLIP to understand blur-related geometric prior knowledge from the DP pair. Then, we propose a format to input a stereo DP pair to CLIP without any fine-tuning, despite the fact that CLIP is pre-trained on monocular images. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see the teaser figure).
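
The abstract names a blur-weighting loss but does not give its form. As a rough illustration only, the sketch below shows one way a per-pixel blur map could reweight an L1 reconstruction error so that more heavily blurred pixels contribute more. The function name `blur_weighted_l1`, the `alpha` parameter, and the `1 + alpha * blur` weighting are all assumptions made for this example, not the paper's formulation.

```python
def blur_weighted_l1(pred, target, blur_map, alpha=1.0):
    """Weighted mean absolute error over an image.

    Hypothetical sketch: each pixel's error is weighted by 1 + alpha * blur.
    pred, target, blur_map are 2D lists (rows of floats) of the same shape.
    blur_map values are assumed to lie in [0, 1], with 1 meaning most blurred.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, b_row in zip(pred, target, blur_map):
        for p, t, b in zip(p_row, t_row, b_row):
            w = 1.0 + alpha * b          # assumed weighting, not from the paper
            total += w * abs(p - t)
            weight_sum += w
    return total / weight_sum            # normalise by the total weight


# Toy 2x2 example: the only error sits on the single blurred pixel.
pred = [[0.5, 0.5], [0.5, 0.5]]
target = [[0.5, 0.5], [0.5, 1.0]]
blur = [[0.0, 0.0], [0.0, 1.0]]
no_blur = [[0.0, 0.0], [0.0, 0.0]]

weighted = blur_weighted_l1(pred, target, blur)
uniform = blur_weighted_l1(pred, target, no_blur)
```

In this toy case the weighted loss is larger than the uniform one, because the error falls on the pixel the blur map marks as blurred. A real implementation would operate on tensors in a deep learning framework and would follow whatever definition the paper itself gives.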

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
