CV, Oct 22, 2013

Word Spotting in Cursive Handwritten Documents using Modified Character Shape Codes

arXiv:1310.6063v1, 8 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of querying historical and scientific handwritten documents for archivists and researchers, but it is incremental: it builds on existing word spotting methods, adding a two-level selection approach.

The paper tackles the problem of indexing and searching handwritten English documents by applying a word spotting technique using Modified Character Shape Codes, resulting in a faster and more efficient process that reduces pre-processing needs.

A large collection of handwritten English paper documents of historical and scientific importance exists, but computers cannot read paper documents directly. The closest way of indexing them is to store a digital image of each document, so that a large database of document images can replace the paper originals. Even then, the text and data in each image cannot be recognized directly by the computer. This paper applies word spotting using Modified Character Shape Codes to handwritten English document images, enabling quick and efficient query search for words in a database of document images. It differs from other word spotting techniques in that it implements two levels of selection for word segments to match the search query: first by word size, then by the character shape code of the query. This makes the process faster and more efficient and reduces the need for multiple pre-processing steps.
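The two-level selection described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-class shape code (ascender `A`, descender `D`, x-height `x`), the `WordSegment` record, the `px_per_char` width estimate, and the `tolerance` window are all hypothetical. The sketch also assumes each word segment already carries a width and a shape code extracted from the image, a step the paper performs and this example does not.

```python
from dataclasses import dataclass

# Hypothetical three-class coding; the paper's Modified Character Shape
# Code may use different classes and features.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape_code(word: str) -> str:
    """Map each letter of a query word to a coarse shape class."""
    code = []
    for ch in word.lower():
        if ch in ASCENDERS:
            code.append("A")
        elif ch in DESCENDERS:
            code.append("D")
        else:
            code.append("x")
    return "".join(code)


@dataclass
class WordSegment:
    """A segmented word image, reduced to the two features used here."""
    segment_id: str
    width_px: int
    code: str  # assumed to be extracted from the image beforehand


def spot(query, segments, px_per_char=20.0, tolerance=0.25):
    """Return segments that pass both selection levels for the query."""
    # Level 1: keep segments whose width is close to the expected width.
    expected = len(query) * px_per_char
    low, high = expected * (1 - tolerance), expected * (1 + tolerance)
    by_size = [s for s in segments if low <= s.width_px <= high]

    # Level 2: keep segments whose shape code equals the query's code.
    target = shape_code(query)
    return [s for s in by_size if s.code == target]
```

The cheap size filter runs first so that the shape-code comparison only touches segments of plausible length, which is one way to read the abstract's claim about speed. A real system would likely use approximate rather than exact code matching to tolerate handwriting variation.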
