CV, Apr 24, 2018

An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches

arXiv:1804.09003v1, 131 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of complex anchor design in text detection for computer vision applications, offering an incremental improvement over existing methods.

The paper tackles the inefficiency of anchor-based region proposal networks in scene text detection by proposing an anchor-free region proposal network (AF-RPN) within the Faster R-CNN framework. AF-RPN achieves a higher recall rate than a vanilla RPN and FPN-RPN on the COCO-Text dataset, and the resulting detector achieves state-of-the-art results on the ICDAR-2017 MLT, ICDAR-2015 and ICDAR-2013 benchmarks.

The anchor mechanism of the Faster R-CNN and SSD frameworks is considered not effective enough for scene text detection, which can be attributed to its IoU-based matching criterion between anchors and ground-truth boxes. To better enclose scene text instances of various shapes, anchors of various scales, aspect ratios and even orientations must be designed manually, which makes anchor-based methods complicated and inefficient. In this paper, we propose a novel anchor-free region proposal network (AF-RPN) to replace the original anchor-based RPN in the Faster R-CNN framework and address this problem. Compared with a vanilla RPN and FPN-RPN, AF-RPN avoids complicated anchor design and achieves a higher recall rate on the large-scale COCO-Text dataset. Owing to the high-quality text proposals, our Faster R-CNN based two-stage text detection approach achieves state-of-the-art results on the ICDAR-2017 MLT, ICDAR-2015 and ICDAR-2013 text detection benchmarks using only single-scale, single-model (ResNet50) testing.
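
The IoU-based matching criterion mentioned in the abstract can be illustrated with a small sketch. It is not code from the paper: the box format (x1, y1, x2, y2), the example coordinates, and the 0.7 positive-match threshold are assumptions chosen for illustration. It shows how a long, thin text line can overlap a square anchor substantially and still receive a low IoU, so the anchor would not be matched as a positive.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: a 256x32 text line and a 64x64 square anchor
# centred on the same point (128, 16).
text_box = (0.0, 0.0, 256.0, 32.0)
square_anchor = (96.0, -16.0, 160.0, 48.0)

# Assumed threshold for illustration; actual values depend on the detector config.
POSITIVE_IOU_THRESHOLD = 0.7

score = iou(text_box, square_anchor)
is_positive = score >= POSITIVE_IOU_THRESHOLD
print(f"IoU = {score:.2f}, matched as positive: {is_positive}")
```

With these assumed boxes the IoU is 0.2, well below the threshold, even though the anchor is perfectly centred on the text line. Covering such shapes with anchors requires adding matching aspect ratios by hand, which is the design burden an anchor-free proposal network aims to avoid.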

