CV | Dec 13, 2023

Diffusion-based Blind Text Image Super-Resolution

arXiv:2312.08886v2 | 37 citations | h-index: 12 | CVPR
Originality: Incremental advance
AI Analysis

This work addresses the challenge of high-quality text image super-resolution for real-world applications like document restoration, but it is incremental as it builds on existing diffusion model techniques.

The authors tackle the restoration of degraded low-resolution text images, particularly Chinese text with complex strokes. They propose a diffusion-based method that combines an image diffusion model with a text diffusion model to improve both the accuracy of the text structure and the realism of the appearance, and they report more accurate and more realistic results on synthetic and real-world datasets.

Recovering degraded low-resolution text images is challenging, especially for Chinese text images with complex strokes and severe degradation in real-world scenarios. Ensuring both text fidelity and style realness is crucial for high-quality text image super-resolution. Recently, diffusion models have achieved great success in natural image synthesis and restoration due to their powerful data distribution modeling and data generation capabilities. In this work, we propose an Image Diffusion Model (IDM) to restore text images with realistic styles. Diffusion models are suitable not only for modeling realistic image distributions but also for learning text distributions. Since prior work shows that a text prior is important for guaranteeing the correctness of the restored text structure, we also propose a Text Diffusion Model (TDM) for text recognition, which can guide the IDM to generate text images with correct structures. We further propose a Mixture of Multi-modality module (MoM) that lets these two diffusion models cooperate with each other at every diffusion step. Extensive experiments on synthetic and real-world datasets demonstrate that our Diffusion-based Blind Text Image Super-Resolution (DiffTSR) can restore text images with both more accurate text structures and more realistic appearances.
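The cooperation described in the abstract, where two diffusion models exchange guidance at every step, can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions, not the DiffTSR implementation: the functions `mom`, `idm_step`, `tdm_step`, and `joint_sample`, their signatures, and all update rules are hypothetical stand-ins, the "image" is a short list of floats, the "text" is a list of per-character probability vectors, and conditioning on the low-resolution input is omitted entirely.

```python
import random

def mom(image, text_probs, t, total_steps):
    """Hypothetical MoM stand-in: derive a condition for each model from both modalities."""
    weight = 1.0 - t / total_steps            # trust conditions more at later, less noisy steps
    image_summary = sum(image) / len(image)   # condition handed to the text model
    text_conf = [max(p) for p in text_probs]  # condition handed to the image model
    return weight * image_summary, [weight * c for c in text_conf]

def idm_step(image, text_cond, t, total_steps, rng):
    """Toy image step: pull values toward a text-derived target and add shrinking noise."""
    target = sum(text_cond) / len(text_cond)
    sigma = 0.1 * t / total_steps
    return [0.9 * x + 0.1 * target + rng.gauss(0.0, sigma) for x in image]

def tdm_step(text_probs, image_cond, t, total_steps):
    """Toy text step: sharpen each character distribution, more so for a stronger image condition."""
    power = 1.0 + abs(image_cond)
    sharpened_rows = []
    for p in text_probs:
        sharpened = [q ** power for q in p]
        z = sum(sharpened)
        sharpened_rows.append([q / z for q in sharpened])  # renormalise to a distribution
    return sharpened_rows

def joint_sample(num_pixels=8, num_chars=3, vocab=4, total_steps=10, seed=0):
    """Run both toy models side by side, exchanging conditions at every step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]  # start from noise
    text_probs = []
    for _ in range(num_chars):
        raw = [rng.random() + 1e-6 for _ in range(vocab)]
        z = sum(raw)
        text_probs.append([r / z for r in raw])
    for t in range(total_steps, 0, -1):
        # Conditions are computed from the states before either model updates.
        image_cond, text_cond = mom(image, text_probs, t, total_steps)
        new_image = idm_step(image, text_cond, t, total_steps, rng)
        new_text = tdm_step(text_probs, image_cond, t, total_steps)
        image, text_probs = new_image, new_text
    return image, text_probs

if __name__ == "__main__":
    image, text_probs = joint_sample()
    print(len(image), len(text_probs))
```

The point of the sketch is the control flow: neither model runs to completion before the other starts. Both advance one step at a time, and the mixing function is called inside the loop so that each update sees the other modality's current state.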

Code Implementations: 1 repo
