CV · Apr 22, 2024

Text in the Dark: Extremely Low-Light Text Image Enhancement

arXiv:2404.14135v1 · 4 citations · h-index: 5 · Has Code · Signal Process Commun
Originality: Incremental advance
AI Analysis

The paper addresses a domain-specific problem for computer vision researchers and practitioners working on low-light scene text applications, and it offers a novel method for a known bottleneck.

The paper tackles the problem of enhancing extremely low-light text images to improve scene text detection and recognition, proposing a novel encoder-decoder framework with edge-aware attention and specialized losses. The method outperforms state-of-the-art approaches on image quality and scene text metrics across LOL, SID, and synthetic IC15 datasets.

Extremely low-light text images are common in natural scenes, making scene text detection and recognition challenging. One solution is to enhance these images using low-light image enhancement methods before text extraction. However, previous methods often do not specifically address the significance of low-level features, which are crucial for optimal performance on downstream scene text tasks. Further research is also hindered by the lack of extremely low-light text datasets. To address these limitations, we propose a novel encoder-decoder framework with an edge-aware attention module to focus on scene text regions during enhancement. Our proposed method uses novel text detection and edge reconstruction losses to emphasize low-level scene text features, leading to successful text extraction. Additionally, we present a Supervised Deep Curve Estimation (Supervised-DCE) model to synthesize extremely low-light images based on publicly available scene text datasets such as ICDAR15 (IC15). We also labeled texts in the extremely low-light See In the Dark (SID) and ordinary LOw-Light (LOL) datasets to allow for objective assessment of extremely low-light image enhancement through scene text tasks. Extensive experiments show that our model outperforms state-of-the-art methods in terms of both image quality and scene text metrics on the widely-used LOL, SID, and synthetic IC15 datasets. Code and dataset will be released publicly at https://github.com/chunchet-ng/Text-in-the-Dark.
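The abstract does not say how Supervised-DCE parameterizes its curves, so the following is only a rough illustration of the general idea of curve-based darkening, not the authors' model. It assumes the quadratic adjustment curve x + alpha * x * (1 - x) used in the Deep Curve Estimation line of work, applied iteratively with a single hand-picked negative alpha. The function name `apply_curve`, the value of `alpha`, and the iteration count are all hypothetical choices for this sketch; a learned model would instead predict the curve parameters, typically per pixel.

```python
def apply_curve(pixels, alpha, iterations=8):
    """Iteratively apply x <- x + alpha * x * (1 - x) to intensities in [0, 1].

    A negative alpha darkens mid-tones; a positive alpha brightens them.
    Illustrative only: alpha is fixed by hand here, not learned.
    """
    if not -1.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [-1, 1] to keep values in [0, 1]")
    out = list(pixels)
    for _ in range(iterations):
        out = [x + alpha * x * (1.0 - x) for x in out]
    return out


# A toy row of normalized pixel intensities, from black to white.
row = [0.0, 0.25, 0.5, 0.75, 1.0]
dark = apply_curve(row, alpha=-0.8)

for before, after in zip(row, dark):
    print(f"{before:.2f} -> {after:.4f}")
```

With alpha restricted to [-1, 1], the curve keeps values inside [0, 1] and preserves their ordering, so darker pixels stay darker than brighter ones. The endpoints 0 and 1 are fixed points of this curve, which is one reason a single global alpha is too crude for realistic low-light synthesis.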

Code Implementations: 1 repo
