CV, Jul 5, 2017

R-PHOC: Segmentation-Free Word Spotting using CNN

arXiv:1707.01294v1, 9 citations
Originality: Incremental advance
AI Analysis

This work addresses word spotting in document images without requiring prior word segmentation, a problem relevant to researchers and practitioners in document analysis. The contribution is incremental, as it builds on existing PHOC embeddings.

The paper tackles segmentation-free word spotting by proposing a region-based CNN that embeds word candidate bounding boxes into an embedding space for nearest neighbour search, improving on the state of the art on the GW dataset and matching segmentation-based methods in some cases.
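The nearest neighbour search described above can be illustrated with a minimal sketch. It assumes the query and every candidate bounding box have already been embedded as equal-length vectors, and it uses cosine similarity as the ranking score. The function names and the choice of cosine similarity are assumptions made for illustration; the summary does not state which distance the authors use.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return (candidate_index, score) pairs sorted from best to worst match.

    Hypothetical helper: the score is an assumed choice (cosine similarity),
    not necessarily the one used in the paper.
    """
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


if __name__ == "__main__":
    query = [1, 0, 1]
    candidates = [[0, 1, 0], [1, 0, 1], [1, 0, 0]]
    for index, score in rank_candidates(query, candidates):
        print(index, round(score, 3))
```

In a word spotting setting, the top-ranked indices would correspond to the candidate boxes most likely to contain the query word.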

This paper proposes a region-based convolutional neural network for segmentation-free word spotting. Our network takes as input an image and a set of word candidate bounding boxes and embeds all bounding boxes into an embedding space, where word spotting can be cast as a simple nearest neighbour search between the query representation and each of the candidate bounding boxes. We make use of the PHOC embedding, as it has previously achieved significant success in segmentation-based word spotting. Word candidates are generated using a simple procedure based on grouping connected components using some spatial constraints. Experiments show that R-PHOC, which operates on images directly, can improve on the current state of the art on the standard GW dataset and in some cases performs as well as PHOCNET, which is designed for segmentation-based word spotting.
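To make the PHOC (pyramidal histogram of characters) target concrete, here is a simplified sketch of how a PHOC vector can be built from a text string. It assumes an alphabet of lowercase letters plus digits, unigram pyramid levels 2 to 5, and the usual rule that a character belongs to a region when at least half of the character's span falls inside it. The alphabet, the level set, and the omission of bigram levels are assumptions for illustration and may differ from the configuration used in the paper.

```python
import string

# Assumed alphabet and pyramid levels; not taken from the paper.
ALPHABET = string.ascii_lowercase + string.digits
LEVELS = (2, 3, 4, 5)


def phoc(word, levels=LEVELS, alphabet=ALPHABET):
    """Build a binary PHOC vector (unigram levels only) for a word.

    Positions are compared in integer units of 1 / (len(word) * level)
    to avoid floating-point rounding at the 50% overlap boundary.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            # Region spans [region * n, (region + 1) * n] in scaled units.
            r_lo, r_hi = region * n, (region + 1) * n
            bits = [0] * len(alphabet)
            for i, ch in enumerate(word):
                if ch not in alphabet:
                    continue  # characters outside the alphabet are ignored
                # Character i spans [i * level, (i + 1) * level] in scaled units.
                c_lo, c_hi = i * level, (i + 1) * level
                overlap = min(r_hi, c_hi) - max(r_lo, c_lo)
                # Assign the character if at least half of its span is in the region.
                if 2 * overlap >= level:
                    bits[alphabet.index(ch)] = 1
            vec.extend(bits)
    return vec


if __name__ == "__main__":
    v = phoc("ab")
    print(len(v))  # number of dimensions: (2 + 3 + 4 + 5) * 36
```

A query string encoded this way lives in the same space that the network is trained to predict for each candidate box, which is what allows retrieval to reduce to a vector comparison.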

