CV · Jul 28, 2014

A Fast Hierarchical Method for Multi-script and Arbitrary Oriented Scene Text Extraction

arXiv:1407.7504v1 · 40 citations
Originality: Highly original
AI Analysis

This paper addresses the problem of detecting text in natural scenes for applications such as document analysis and image understanding, offering a novel hierarchical approach that improves accuracy in unconstrained scenarios.

The paper tackles scene text segmentation by leveraging the hierarchical structure of text through an agglomerative clustering method, achieving state-of-the-art performance on four standard datasets with variable orientations and languages.

Typography and layout lead to the hierarchical organisation of text into words, text lines, and paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained on a single mixed dataset, outperforms state-of-the-art methods in unconstrained scenarios.
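The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the single-linkage merge criterion, the `(x, y, height)` feature vector, and the hand-written `toy_stopping_rule` are all assumptions made for illustration. In particular, the toy rule stands in for the paper's combination of a discriminative classifier and a meaningfulness measure, which is not reproduced here.

```python
from itertools import combinations
from math import dist


def build_hierarchy(features):
    """Agglomerative clustering (single linkage assumed here).

    Returns every merged group, in merge order, as a frozenset of region indices.
    """
    clusters = [frozenset([i]) for i in range(len(features))]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters whose closest members are nearest in feature space.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                dist(features[i], features[j]) for i in pair[0] for j in pair[1]
            ),
        )
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        merges.append(merged)
    return merges


def toy_stopping_rule(group, features):
    """Hypothetical stand-in for the paper's stopping rule.

    Accepts groups of at least three regions with near-uniform height.
    """
    heights = [features[i][2] for i in group]
    mean_height = sum(heights) / len(heights)
    return len(group) >= 3 and max(heights) - min(heights) <= 0.2 * mean_height


def select_groups(merges, features, accept):
    """Keep the largest groups in the hierarchy that the stopping rule accepts."""
    accepted = [g for g in merges if accept(g, features)]
    return [g for g in accepted if not any(g < other for other in accepted)]


# Toy input: three evenly spaced, same-height regions plus one distant outlier.
regions = [(0, 0, 10), (12, 0, 10), (24, 0, 10), (200, 150, 40)]
hierarchy = build_hierarchy(regions)
groups = select_groups(hierarchy, regions, toy_stopping_rule)
print(groups)
```

The sketch evaluates every node of the hierarchy as a candidate text group and keeps only the maximal accepted ones, which mirrors the abstract's framing of detection as a search over region groupings within the hierarchy.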

