CV | DL | Feb 27, 2018

Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

arXiv:1802.10033v1 | 4 citations
Originality: Incremental advance
AI Analysis

This work addresses OCR accuracy for historical document digitization, representing an incremental improvement over existing methods.

The paper tackles the problem of improving OCR accuracy on early printed books by combining convolutional and LSTM networks, reducing error by up to 44% and, with an additional voting step, achieving character error rates (CER) below 0.5%.
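The two figures quoted above are a character error rate and a relative error reduction. A minimal sketch of how such numbers are commonly computed follows, assuming CER is the Levenshtein edit distance divided by the ground-truth length. The function names and the example values are hypothetical and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(prediction: str, ground_truth: str) -> float:
    """Character error rate: edit distance relative to ground-truth length."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(prediction, ground_truth) / len(ground_truth)


def relative_reduction(baseline_cer: float, improved_cer: float) -> float:
    """Fraction by which the improved model lowers the baseline error."""
    return (baseline_cer - improved_cer) / baseline_cer


# Hypothetical values chosen only to illustrate the arithmetic.
print(cer("Improuing OCR", "Improving OCR"))
print(relative_reduction(0.02, 0.0112))
```

Under this definition, a 44% reduction means the improved CER is 56% of the baseline CER, not that the CER drops by 44 percentage points.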

This paper proposes combining a convolutional network with an LSTM network to improve the accuracy of OCR on early printed books. While the standard model for line-based OCR uses a single LSTM layer, we place a combination of CNN and pooling layers in front of the LSTM layer. Because of its larger number of trainable parameters, the network needs a large number of training examples to reach its full performance. With this architecture, the error is reduced by up to 44%, yielding a CER of 1% and below. To further improve the results, we use a voting mechanism that achieves character error rates (CER) below 0.5%. The runtime of the deep model for training and prediction on a book is very similar to that of a shallow network.
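To illustrate the idea of voting over several model outputs, here is a deliberately simplified sketch. It assumes that the predictions for a text line all have the same length and are already aligned character by character, and it takes a plain majority at each position. The abstract does not say how the paper's voting mechanism works, so this is a stand-in and not the authors' method; `majority_vote` is a hypothetical name.

```python
from collections import Counter


def majority_vote(predictions: list[str]) -> str:
    """Per-position majority vote over equally long, pre-aligned strings.

    Assumption: alignment is trivial because all strings have the same
    length. Real OCR outputs usually differ in length and need an
    alignment step first.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("this sketch requires equally long predictions")

    voted = []
    for chars in zip(*predictions):
        # most_common(1) returns the character with the highest count.
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


# Three hypothetical model outputs for the same line, each with one error.
print(majority_vote(["hause", "house", "houze"]))
```

Voting helps in this toy case because the three outputs make their errors at different positions, so each error is outvoted by the other two.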

Code Implementations: 1 repo
