CV · LG · Aug 16, 2019

A Comparative Study of Filtering Approaches Applied to Color Archival Document Images

arXiv:1908.09007v1

AI Analysis

This work targets a specific problem: improving OCR accuracy on archival documents held by the Tunisian national archives. It is incremental, since it compares existing filtering methods rather than introducing a new technique.

The study addresses poor OCR performance on degraded ancient Arabic documents by comparing four filtering approaches (scalar, marginal, vector, and hybrid) for image enhancement, and it quantifies their performance through numerical experiments on color archival document images.

Current systems used by the Tunisian national archives for the automatic transcription of archival documents are hindered by the limited performance of optical character recognition (OCR) tools. Using a classical OCR system to transcribe and index ancient Arabic documents is not straightforward because of the idiosyncrasies of this category of documents, such as noise and degradation. Applying an enhancement or denoising method therefore remains an essential preprocessing step for archival document image analysis. State-of-the-art methods for enhancing and denoising degraded document images are mainly based on filtering. The most common filtering techniques applied to color images in the literature may be categorized into four approaches: scalar, marginal, vector, and hybrid. To provide comprehensive guidelines on the strengths and weaknesses of these filtering approaches, this article proposes a thorough comparative study. Numerical experiments on color archival document images show and quantify the performance of each assessed filtering approach.
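
To make the distinction between two of these categories concrete, here is a minimal sketch contrasting a marginal and a vector median on a single window of color pixels. It assumes the definitions commonly used in the color image processing literature (marginal: each channel filtered independently; vector: each pixel treated as one 3-component vector). The abstract does not spell these definitions out, and the function names `marginal_median` and `vector_median` are hypothetical, not taken from the paper. Scalar and hybrid variants are not shown.

```python
import numpy as np

def marginal_median(window):
    """Marginal approach (assumed definition): median of each channel on its own.

    `window` has shape (n, 3): n neighbouring pixels, 3 color channels.
    """
    return np.median(window, axis=0)

def vector_median(window):
    """Vector approach (assumed definition): return the window pixel whose
    summed Euclidean distance to all other pixels in the window is smallest."""
    diffs = window[:, None, :] - window[None, :, :]   # (n, n, 3) pairwise differences
    dists = np.linalg.norm(diffs, axis=2)             # (n, n) pairwise distances
    return window[np.argmin(dists.sum(axis=1))]

# Toy window: one pure red, one pure green, one pure blue pixel.
window = np.array([[255, 0, 0],
                   [0, 255, 0],
                   [0, 0, 255]], dtype=float)

m = marginal_median(window)   # black: a color that is absent from the window
v = vector_median(window)     # always one of the original window pixels
```

On this toy window the marginal result is black, a color that none of the input pixels has, while the vector result is by construction one of the input pixels. This kind of trade-off is one plausible reason a comparison across approaches is useful for document images, where spurious colors near ink edges may matter for later OCR.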
