CV · Nov 21, 2016

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

arXiv:1611.06779v1 · 918 citations
Originality: Highly original
AI Analysis

This paper addresses fast, accurate text detection in images for applications such as word spotting and end-to-end recognition. It represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles scene text detection by introducing TextBoxes, an end-to-end trainable detector that is both accurate and efficient. A fast implementation takes only 0.09s per image, and the detector outperforms competing methods on text localization. Combined with a text recognizer, it also outperforms state-of-the-art approaches on word spotting and end-to-end recognition.

This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard non-maximum suppression. TextBoxes outperforms competing methods in terms of text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
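The abstract names standard non-maximum suppression (NMS) as the only post-processing step. The block below is a minimal sketch of generic greedy NMS over axis-aligned boxes, not the authors' implementation. The box layout `(x_min, y_min, x_max, y_max)`, the function names `iou` and `nms`, and the default threshold of 0.5 are assumptions made for illustration; the summary above does not state the values the paper uses.

```python
from typing import List, Tuple

# Assumed box layout (not taken from the paper): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    iou_threshold=0.5 is an illustrative default, not a value from the paper.
    """
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(candidates, confidences))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box is far away and survives.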

Code Implementations: 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
