CV · Sep 8, 2021

Which and Where to Focus: A Simple yet Accurate Framework for Arbitrary-Shaped Nearby Text Detection in Scene Images

arXiv:2109.03451v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific challenge in scene text detection for computer vision applications, representing an incremental improvement over existing methods.

The paper tackles the problem of detecting arbitrary-shaped nearby text in scene images, where previous methods struggle with confusion between adjacent text instances. The proposed method achieves state-of-the-art or competitive performance on multiple benchmarks by introducing a One-to-Many Training Scheme and a Proposal Feature Attention Module.

Scene text detection has drawn close attention from researchers. Though many methods have been proposed for horizontal and oriented texts, previous methods may not perform well when dealing with arbitrary-shaped texts such as curved texts. In particular, a confusion problem arises in the case of nearby text instances. In this paper, we propose a simple yet effective method for accurate arbitrary-shaped nearby scene text detection. Firstly, a One-to-Many Training Scheme (OMTS) is designed to eliminate confusion and enable the proposals to learn more appropriate groundtruths in the case of nearby text instances. Secondly, we propose a Proposal Feature Attention Module (PFAM) to exploit more effective features for each proposal, which can better adapt to arbitrary-shaped text instances. Finally, we propose a baseline that is based on Faster R-CNN and outputs the curve representation directly. Equipped with PFAM and OMTS, the detector achieves state-of-the-art or competitive performance on several challenging benchmarks.
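
The abstract does not spell out how OMTS matches proposals to groundtruths, so the following is only a minimal sketch of one plausible reading of a "one-to-many" scheme, not the authors' implementation. It assumes axis-aligned boxes, an IoU threshold, and a per-candidate loss callback; the names `iou`, `one_to_one_target`, `one_to_many_candidates`, and `select_target` are all hypothetical. The idea illustrated: a conventional assigner ties each proposal to the single groundtruth with the highest IoU, which can be ambiguous for nearby text lines, whereas a one-to-many assigner keeps every sufficiently overlapping groundtruth as a candidate and lets a loss decide which one the proposal should learn.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_one_target(proposal: Box, gts: List[Box], thresh: float) -> Optional[int]:
    """Conventional assignment: the single groundtruth with the highest IoU."""
    if not gts:
        return None
    best = max(range(len(gts)), key=lambda i: iou(proposal, gts[i]))
    return best if iou(proposal, gts[best]) >= thresh else None


def one_to_many_candidates(proposal: Box, gts: List[Box], thresh: float) -> List[int]:
    """Assumed one-to-many step: keep every groundtruth above the threshold."""
    return [i for i, gt in enumerate(gts) if iou(proposal, gt) >= thresh]


def select_target(candidates: List[int], loss_fn: Callable[[int], float]) -> Optional[int]:
    """Assumed selection rule: learn from the candidate with the lowest loss."""
    return min(candidates, key=loss_fn) if candidates else None


# Two vertically adjacent text lines and a proposal straddling both.
gts = [(0.0, 0.0, 10.0, 4.0), (0.0, 4.0, 10.0, 8.0)]
proposal = (0.0, 1.0, 10.0, 6.0)

single = one_to_one_target(proposal, gts, thresh=0.25)
candidates = one_to_many_candidates(proposal, gts, thresh=0.25)

# Hypothetical per-groundtruth losses, e.g. from a curve regression head.
hypothetical_losses = {0: 0.9, 1: 0.3}
chosen = select_target(candidates, lambda i: hypothetical_losses[i])
```

In this toy case the IoU-only rule commits to the first text line, while the one-to-many variant keeps both lines as candidates and, under the made-up losses, picks the second. The threshold value and the minimum-loss rule are illustrative choices rather than details taken from the paper.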

