CV | Sep 12, 2016

Detecting Text in Natural Image with Connectionist Text Proposal Network

arXiv:1609.03605v1 | 1000 citations
Originality: Highly original
AI Analysis

It improves text detection accuracy and efficiency for applications like image analysis, though it is an incremental advance over prior methods.

The paper tackles text detection in natural images by proposing a Connectionist Text Proposal Network (CTPN) that localizes text lines using fine-scale proposals and a recurrent neural network, achieving F-measures of 0.88 and 0.61 on the ICDAR 2013 and 2015 benchmarks, respectively, at 0.14 s per image.

We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural image. The CTPN detects a text line in a sequence of fine-scale text proposals directly in convolutional feature maps. We develop a vertical anchor mechanism that jointly predicts location and text/non-text score of each fixed-width proposal, considerably improving localization accuracy. The sequential proposals are naturally connected by a recurrent neural network, which is seamlessly incorporated into the convolutional network, resulting in an end-to-end trainable model. This allows the CTPN to explore rich context information of image, making it powerful to detect extremely ambiguous text. The CTPN works reliably on multi-scale and multi-language text without further post-processing, departing from previous bottom-up methods requiring multi-step post-processing. It achieves 0.88 and 0.61 F-measure on the ICDAR 2013 and 2015 benchmarks, surpassing recent results [8, 35] by a large margin. The CTPN is computationally efficient with 0.14s/image, by using the very deep VGG16 model [27]. Online demo is available at: http://textdet.com/.
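The vertical anchor idea in the abstract can be illustrated with a minimal sketch: every feature-map location gets several anchors that share one fixed width and differ only in height and vertical position. This is not the authors' code. The function name `vertical_anchors` and all parameter values (a 16-pixel stride, 10 heights per location, a smallest height of 11 pixels growing by a factor of 1/0.7) are assumptions chosen for illustration and are not stated in the text above.

```python
def vertical_anchors(feat_h, feat_w, stride=16, k=10, min_height=11.0, scale=1 / 0.7):
    """Build fixed-width anchors as (x1, y1, x2, y2) boxes in image coordinates.

    All default values are illustrative assumptions, not values taken
    from the abstract above.
    """
    # k candidate heights, growing geometrically from min_height
    heights = [min_height * scale ** i for i in range(k)]
    anchors = []
    for row in range(feat_h):
        # vertical centre of this feature-map row in image pixels
        cy = row * stride + stride / 2.0
        for col in range(feat_w):
            # horizontal extent is fixed: exactly one stride wide
            x1 = col * stride
            x2 = x1 + stride
            for h in heights:
                anchors.append((x1, cy - h / 2.0, x2, cy + h / 2.0))
    return anchors


# Hypothetical 2 x 3 feature map
anchors = vertical_anchors(feat_h=2, feat_w=3)
```

Because the width is fixed, a network built this way would only need to score each anchor as text or non-text and regress its vertical centre and height, which is one plausible reading of "jointly predicts location and text/non-text score of each fixed-width proposal".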

Code Implementations: 27 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
