CV · Aug 24, 2019

Towards Unconstrained End-to-End Text Spotting

arXiv:1908.09231v1 · 139 citations
AI Analysis

This paper addresses unconstrained text spotting, a capability needed by applications such as document analysis and autonomous systems, and reports substantial accuracy gains over prior end-to-end methods.

The paper tackles the problem of reading irregularly shaped scene text. It proposes an end-to-end trainable network that detects and recognizes text simultaneously, without rectification, and surpasses the previous state of the art by 4.6% on the ICDAR15 benchmark and by more than 16% on the Total-Text benchmark.

We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape. We formulate arbitrary shape text detection as an instance segmentation problem; an attention model is then used to decode the textual content of each irregularly shaped text region without rectification. To extract useful irregularly shaped text instance features from image scale features, we propose a simple yet effective RoI masking step. Additionally, we show that predictions from an existing multi-step OCR engine can be leveraged as partially labeled training data, which leads to significant improvements in both the detection and recognition accuracy of our model. Our method surpasses the state-of-the-art for end-to-end recognition tasks on the ICDAR15 (straight) benchmark by 4.6%, and on the Total-Text (curved) benchmark by more than 16%.
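The RoI masking step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `roi_mask_features`, the (H, W, C) feature layout, and the half-open `(y0, x0, y1, x1)` box convention are assumptions made for illustration. The idea shown is only that features cropped to a text instance's bounding box are multiplied by that instance's segmentation mask, so that positions outside the irregular text region contribute nothing to the recognizer.

```python
import numpy as np


def roi_mask_features(features, box, instance_mask):
    """Crop a feature map to a box and zero out positions outside the mask.

    features:      (H, W, C) float array of image-level features (assumed layout)
    box:           (y0, x0, y1, x1) integer box, half-open (assumed convention)
    instance_mask: (H, W) binary array, 1 inside the text instance
    """
    y0, x0, y1, x1 = box
    crop = features[y0:y1, x0:x1, :]
    mask = instance_mask[y0:y1, x0:x1].astype(features.dtype)
    # Broadcast the 2-D mask across the channel axis.
    return crop * mask[:, :, None]


# Toy example: a 4x5 feature map with 2 channels, all ones.
features = np.ones((4, 5, 2), dtype=np.float32)
instance_mask = np.zeros((4, 5), dtype=np.uint8)
instance_mask[1, 1] = 1
instance_mask[2, 2] = 1

masked = roi_mask_features(features, (1, 1, 3, 4), instance_mask)
```

In this toy example `masked` has shape (2, 3, 2), and only the two positions covered by the mask keep their feature values. In a real pipeline the mask would come from the instance segmentation branch and the masked features would be passed to the attention decoder.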

