CV, Oct 23, 2018

Visual Semantic Re-ranker for Text Spotting

arXiv:1810.09776v2
Originality: Incremental advance
AI Analysis

This work improves text-spotting accuracy for applications like document analysis or scene understanding, but it is incremental as it builds on existing methods with a complementary approach.

The paper addresses the lack of semantic context in existing text-recognition methods. It proposes a post-processing re-ranker that uses visual-textual relations to rescore word hypotheses, which boosts the performance of the underlying recognizer at low computational cost on the ICDAR'17 dataset.

Many current state-of-the-art methods for text recognition are based on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network that provides a series of text hypotheses for each input image. These text hypotheses are then re-ranked using their semantic relatedness to the object in the image. As a result of this combination, the performance of the original network is boosted at a very low computational cost. The proposed framework can be used as a drop-in complement for any text-spotting algorithm that outputs a ranking of word hypotheses. We validate our approach on the ICDAR'17 shared task dataset.
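
The re-ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-similarity relatedness measure, the linear interpolation weight `alpha`, the function names, and the toy three-dimensional embeddings are all assumptions made for illustration. It only shows the general idea of rescoring a recognizer's word hypotheses by how related each word is to an object detected in the image.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(hypotheses, object_label, embeddings, alpha=0.5):
    """Re-rank (word, baseline_score) pairs using relatedness to an object label.

    Assumption: the final score is a linear mix of the baseline score and the
    cosine similarity between word and object embeddings. The paper may combine
    the two signals differently.
    """
    obj_vec = embeddings.get(object_label)
    rescored = []
    for word, base in hypotheses:
        vec = embeddings.get(word)
        if vec is None or obj_vec is None:
            rel = 0.0  # unknown word or object: no semantic evidence
        else:
            rel = max(cosine(vec, obj_vec), 0.0)  # clip negative similarity
        rescored.append((word, (1 - alpha) * base + alpha * rel))
    # Highest combined score first
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; a real system would use pretrained word vectors.
TOY_EMBEDDINGS = {
    "train": [0.9, 0.1, 0.0],
    "station": [0.8, 0.2, 0.1],
    "stallion": [0.0, 0.2, 0.9],
}

# The baseline recognizer slightly prefers the wrong word.
hyps = [("stallion", 0.55), ("station", 0.45)]
ranked = rerank(hyps, "train", TOY_EMBEDDINGS)
print([word for word, _ in ranked])
```

With a train detected in the scene, "station" overtakes "stallion" in this toy setup. Setting `alpha=0.0` recovers the baseline ranking, which reflects the drop-in nature of the approach: the recognizer itself is left unchanged and only its output list is reordered.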

Code Implementations: 1 repo
