CV · Nov 23, 2021

StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

arXiv:2111.11718v1 · 1 citation
Originality: Highly original
AI Analysis

This paper addresses the problem of detecting text in complex scenes for computer vision applications. It is an incremental improvement that applies a novel method to known bottlenecks.

The paper tackles scene text detection by proposing StrokeNet, which directly localizes text strokes and uses hierarchical graph reasoning to handle small, low-resolution, or arbitrarily shaped text, achieving state-of-the-art performance on benchmarks.

Scene text detection remains challenging because strokes can be extremely small or low-resolution, and text instances can be closely spaced or arbitrarily shaped. In this paper, StrokeNet is proposed to detect text effectively by capturing fine-grained strokes and inferring structural relations among the hierarchical representations in a graph. Unlike existing approaches that represent a text area by a series of points or rectangular boxes, we directly localize the strokes of each text instance through a Stroke Assisted Prediction Network (SAPN). In addition, a Hierarchical Relation Graph Network (HRGN) performs relational reasoning and predicts the likelihood of linkages, effectively splitting close text instances and grouping node classification results into arbitrarily shaped text regions. We introduce a novel dataset with stroke-level annotations, SynthStroke, for offline pre-training of our model. Experiments on a wide range of benchmarks verify the state-of-the-art performance of our method. Our dataset and code will be made available.
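
The abstract describes predicting linkage likelihoods between nodes and then grouping nodes into text instances. As a rough illustration of that grouping step only, the sketch below thresholds pairwise link probabilities and merges linked nodes with union-find. It is not the authors' implementation: the function name `group_nodes`, the `link_probs` dictionary format, and the `threshold` value are all assumptions made for this example, and the abstract does not say how HRGN actually performs the grouping.

```python
from collections import defaultdict


def group_nodes(num_nodes, link_probs, threshold=0.5):
    """Group nodes into candidate text instances.

    Hypothetical helper: link_probs maps a node pair (a, b) to a predicted
    linkage probability. Pairs at or above the threshold are merged.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), prob in link_probs.items():
        if prob >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for node in range(num_nodes):
        groups[find(node)].append(node)
    return sorted(groups.values())


# Made-up probabilities: the weak link between nodes 2 and 3 separates
# two nearby instances.
example_links = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.95}
print(group_nodes(5, example_links))  # [[0, 1, 2], [3, 4]]
```

In this toy setting, a low predicted linkage between two adjacent nodes is what keeps close text instances apart, while chains of strong links allow a group to follow an arbitrary shape.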

