CL · CV · IR · May 8, 2012

Spectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam

arXiv:1205.1639v1 · 0.66 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific bottleneck for improving OCR accuracy in printed Malayalam documents, representing an incremental improvement.

The paper tackles the problem of close-matching characters in Malayalam OCR, which keeps accuracy from improving beyond the 85-95% range, by developing a specialized SVM classifier that uses spectral features from projection histograms to enhance recognition.

The success rates of Optical Character Recognition (OCR) systems for printed Malayalam documents are quite impressive, with state-of-the-art accuracy levels in the range of 85-95% for various. However, for real applications, further enhancement of these accuracy levels is required. One of the bottlenecks to further enhancement of the accuracy is identified as close-matching characters. In this paper, we delineate the close-matching characters in Malayalam and report the development of a specialised classifier for these close-matching characters. The output of a state-of-the-art OCR system is taken, and characters falling into the close-matching character set are further fed into this specialised classifier to enhance the accuracy. The classifier is based on the support vector machine algorithm and uses feature vectors derived from spectral coefficients of the projection histogram signals of close-matching characters.
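The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a glyph is a binary pixel grid, uses a plain discrete Fourier transform as a stand-in for the "spectral coefficients" (the abstract does not name the transform), and accepts any callable as the specialised classifier. The function names, the parameter `k`, and the placeholder labels are all hypothetical.

```python
import cmath


def projection_histograms(glyph):
    """Return (horizontal, vertical) projection histograms of a binary glyph.

    glyph is a list of rows, each row a list of 0/1 pixels (1 = ink).
    """
    horizontal = [sum(row) for row in glyph]        # ink count per row
    vertical = [sum(col) for col in zip(*glyph)]    # ink count per column
    return horizontal, vertical


def spectral_coefficients(signal, k):
    """Magnitudes of the first k DFT coefficients of a 1-D signal.

    Assumption: a plain DFT stands in for the paper's spectral transform.
    """
    n = len(signal)
    coeffs = []
    for f in range(k):
        acc = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                  for t, x in enumerate(signal))
        coeffs.append(abs(acc) / n)                 # normalise by signal length
    return coeffs


def feature_vector(glyph, k=4):
    """Concatenate spectral features of both projection histograms."""
    horizontal, vertical = projection_histograms(glyph)
    return spectral_coefficients(horizontal, k) + spectral_coefficients(vertical, k)


def recognise(glyph, base_label, close_matching_set, specialised_classifier):
    """Route only close-matching characters to the specialised classifier.

    base_label is the output of the general OCR engine; specialised_classifier
    is any callable mapping a feature vector to a label (hypothetical interface).
    """
    if base_label in close_matching_set:
        return specialised_classifier(feature_vector(glyph))
    return base_label


# Toy 4x4 glyph and placeholder labels, for illustration only.
glyph = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]
features = feature_vector(glyph, k=2)
label = recognise(glyph, "B", {"B", "C"}, lambda fv: "C")
```

In a fuller version, the callable passed as `specialised_classifier` would wrap a trained support vector machine, for example the `predict` method of scikit-learn's `SVC` fitted on feature vectors of labelled close-matching glyphs. That choice of library is an assumption, not something the abstract states.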

