Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks
This work addresses OCR accuracy for historical document digitization, representing an incremental improvement over existing methods.
The paper tackles the problem of improving OCR accuracy on early printed books by combining convolutional and LSTM networks, reducing error by up to 44% and achieving character error rates below 0.5%.
This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books. While the standard model of line-based OCR uses a single LSTM layer, we place a combination of CNN and pooling layers in front of the LSTM layer. Because of its larger number of trainable parameters, the network needs a large number of training examples to reach its full performance. Given enough training data, the error is reduced by up to 44%, yielding a character error rate (CER) of 1% and below. To further improve the results, we use a voting mechanism to achieve CERs below $0.5\%$. The runtime of the deep model for training and prediction of a book is very similar to that of a shallow network.
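To make the two reported quantities concrete, the following is a minimal sketch of how a character error rate and a relative error reduction are commonly computed. It assumes the usual definition of CER as the Levenshtein edit distance between prediction and ground truth divided by the number of ground-truth characters; the abstract does not state the exact evaluation protocol, so the function names (`edit_distance`, `cer`, `relative_error_reduction`) and the sample strings are hypothetical illustrations, not the authors' implementation.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference chars into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed CER definition: total edit distance / total ground-truth characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of lines")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("ground truth must contain at least one character")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_chars


def relative_error_reduction(baseline_cer: float, improved_cer: float) -> float:
    """Fraction of the baseline error removed by the improved model."""
    return (baseline_cer - improved_cer) / baseline_cer


if __name__ == "__main__":
    # Hypothetical text lines, chosen only to exercise the functions.
    truth = ["abcd", "efgh"]
    predicted = ["abed", "efgh"]
    print(cer(truth, predicted))                 # one substitution in 8 characters
    print(relative_error_reduction(0.02, 0.01))  # halving the CER
```

Under this definition, a reduction "by up to 44%" is a relative figure: the improved CER is at most 44% lower than the baseline CER, which is different from lowering the CER by 44 percentage points.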