CV · Sep 8, 2021

Which and Where to Focus: A Simple yet Accurate Framework for Arbitrary-Shaped Nearby Text Detection in Scene Images

Youhui Guo, Yu Zhou, Xugong Qin, Weiping Wang

arXiv:2109.03451v1 · 6.55 citations

Originality: Incremental advance

AI Analysis

This work addresses a specific challenge in scene text detection for computer vision applications, representing an incremental improvement over existing methods.

The paper tackles the problem of detecting arbitrary-shaped nearby text in scene images, where previous methods struggle with confusion between adjacent text instances. The proposed method achieves state-of-the-art or competitive performance on multiple benchmarks by introducing a One-to-Many Training Scheme and a Proposal Feature Attention Module.

Scene text detection has drawn close attention from researchers. Though many methods have been proposed for horizontal and oriented texts, previous methods may not perform well when dealing with arbitrary-shaped texts such as curved texts. In particular, a confusion problem arises in the case of nearby text instances. In this paper, we propose a simple yet effective method for accurate arbitrary-shaped nearby scene text detection. First, a One-to-Many Training Scheme (OMTS) is designed to eliminate confusion and enable the proposals to learn more appropriate ground truths in the case of nearby text instances. Second, we propose a Proposal Feature Attention Module (PFAM) to exploit more effective features for each proposal, which can better adapt to arbitrary-shaped text instances. Finally, we propose a baseline that is based on Faster R-CNN and outputs the curve representation directly. Equipped with PFAM and OMTS, the detector achieves state-of-the-art or competitive performance on several challenging benchmarks.
