CV · Jan 23, 2020

Text Extraction and Restoration of Old Handwritten Documents

arXiv:2001.08742v1 · 10 citations
Originality: Incremental advance
AI Analysis

This work addresses digital heritage preservation by enabling restoration of old handwritten documents, with potential extension to printed documents, though it is incremental in improving existing methods.

The paper tackles the problem of restoring old degraded handwritten documents by proposing two deep neural network methods for text extraction and background reconstruction, achieving good performance on severely degraded images even with a small dataset of 26 heritage letters.

Image restoration is a crucial computer vision task. This paper describes two novel methods for restoring old, degraded handwritten documents using deep neural networks. In addition, a small-scale dataset of 26 heritage letter images is introduced. The ground truth data used to train the networks is generated semi-automatically through a pragmatic combination of color transformation, Gaussian mixture model based segmentation, and shape correction using mathematical morphological operators. In the first approach, a deep neural network is used for text extraction from the document image, and the background is then reconstructed using Gaussian mixture modeling. However, Gaussian mixture modeling requires parameters to be set manually; to alleviate this, we propose a second approach in which both background reconstruction and foreground extraction (which includes extracting the text with its original colour) are done using a deep neural network. Experiments demonstrate that the proposed systems perform well on handwritten document images with severe degradations, even when trained on a small dataset. Hence, the proposed methods are ideally suited for digital heritage preservation repositories. It is worth mentioning that these methods can be easily extended to degraded printed documents.
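
The semi-automatic ground-truth step named in the abstract (color transformation, Gaussian mixture segmentation, morphological shape correction) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the color space, the number of mixture components, or the structuring element, so the luminance conversion, the two-component 1-D mixture, the 3x3 opening, and all function names below are assumptions made for illustration.

```python
import math

def to_gray(rgb_image):
    """Colour transformation (assumed): RGB to luminance with BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def responsibilities(v, means, variances, weights):
    """Posterior probability of each mixture component for intensity v."""
    p = [w * math.exp(-(v - m) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)
         for m, s2, w in zip(means, variances, weights)]
    total = sum(p)
    if total == 0.0:  # both densities underflowed: fall back to nearest mean
        nearest = min(range(len(means)), key=lambda k: abs(v - means[k]))
        return [1.0 if k == nearest else 0.0 for k in range(len(means))]
    return [pk / total for pk in p]

def fit_gmm_1d(values, iters=30):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    lo, hi = min(values), max(values)
    means = [lo, hi]
    variances = [max(((hi - lo) / 4) ** 2, 1.0)] * 2
    weights = [0.5, 0.5]
    for _ in range(iters):
        resp = [responsibilities(v, means, variances, weights) for v in values]
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # empty component: keep its previous parameters
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1.0)  # floor keeps the density finite
            weights[k] = nk / len(values)
    return means, variances, weights

def _morph(mask, combine):
    """Apply a 3x3 neighbourhood test; out-of-bounds counts as background."""
    h, w = len(mask), len(mask[0])
    return [[combine(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]

def opening(mask):
    """Shape correction (assumed): erosion then dilation removes small specks."""
    return _morph(_morph(mask, all), any)

def make_ground_truth(rgb_image):
    """Return a boolean text mask: True where a pixel is classified as ink."""
    gray = to_gray(rgb_image)
    means, variances, weights = fit_gmm_1d([v for row in gray for v in row])
    dark = 0 if means[0] <= means[1] else 1  # darker component is taken as ink
    raw = [[responsibilities(v, means, variances, weights)[dark] > 0.5
            for v in row] for row in gray]
    return opening(raw)
```

A 3x3 opening also erases strokes thinner than three pixels, so a real pipeline would likely need a smaller or differently shaped structuring element, and that choice is exactly the kind of manual tuning the semi-automatic procedure implies.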
