CV · Apr 1, 2016

PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents

arXiv:1604.00187v3 · 236 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient and accurate word retrieval in historical or scanned documents, a task relevant to researchers and archivists, and represents an incremental improvement over existing methods.

The paper tackles word spotting in handwritten documents by proposing a CNN architecture trained with the PHOC representation, achieving state-of-the-art results on several word spotting benchmarks with short training and test times.

In recent years, deep convolutional neural networks have achieved state-of-the-art performance in various computer vision tasks such as classification, detection, or segmentation. Due to their outstanding performance, CNNs are increasingly used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state-of-the-art results on various word spotting benchmarks while exhibiting short training and test times.
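To make the PHOC representation mentioned in the abstract more concrete, here is a minimal sketch of how a binary pyramidal histogram of characters could be built from a word string. The abstract does not specify any of the details, so the 36-symbol alphabet (lowercase letters plus digits), the pyramid levels 2 to 5, the 50% overlap rule, and the function name `phoc` are all assumptions made for illustration; bigram levels are omitted. This is not the authors' implementation.

```python
import string

# Assumed alphabet: lowercase Latin letters plus digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits
# Assumed pyramid levels; level L splits the word into L equal regions.
LEVELS = (2, 3, 4, 5)


def phoc(word, alphabet=ALPHABET, levels=LEVELS):
    """Return a binary unigram PHOC vector for `word` (illustrative sketch).

    Character k of an n-character word occupies the interval
    [k/n, (k+1)/n]. It is assigned to a region when at least half of
    that interval lies inside the region. All comparisons are scaled by
    n * level so that only integer arithmetic is used.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            # Region bounds in units of 1 / (n * level).
            r_start, r_end = region * n, (region + 1) * n
            for k, ch in enumerate(word):
                idx = alphabet.find(ch)
                if idx < 0:
                    continue  # skip symbols outside the assumed alphabet
                # Character bounds in the same units; its width is `level`.
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if 2 * overlap >= level:  # at least 50% of the character
                    bits[idx] = 1
            vec.extend(bits)
    return vec


if __name__ == "__main__":
    v = phoc("word")
    print(len(v), sum(v))
```

Under these assumptions the vector has (2 + 3 + 4 + 5) × 36 = 504 entries. A network trained on such targets would typically predict each entry independently, but the abstract does not state the output layer or loss, so that part is left out of the sketch.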

Code Implementations: 1 repo
