Modified Segmentation Algorithm for Recognition of Older Geez Scripts Written on Vellum
This work addresses the challenge of digitizing historical handwritten documents, specifically older Geez scripts written on vellum, and offers an incremental improvement in this domain-specific area.
The researchers tackled the problem of recognizing older Geez scripts written on vellum by introducing a modified segmentation algorithm, achieving a recognition accuracy of 79.32% using an SVM multiclass classifier.
Recognition of handwritten documents aims at transforming document images into a machine-understandable format. Handwritten document recognition is among the most challenging areas in the field of pattern recognition. It becomes more complex when a document was written on vellum hundreds of years ago, as is the case for older Geez scripts. In this study, we introduced a modified segmentation approach to recognize older Geez scripts. We used adaptive filtering for noise reduction, ISODATA iterative global thresholding for document image binarization, and modified bounding box projection to segment distinct strokes between Geez characters, numbers, and punctuation marks. A multiclass SVM classifier achieved a recognition accuracy of 79.32% with the modified segmentation algorithm.
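To make the binarization and segmentation steps more concrete, the following is a minimal sketch in plain Python of a textbook iterative (ISODATA-style) global threshold followed by an unmodified vertical projection with bounding boxes. It is not the modified algorithm proposed in this study, whose details are not given here. The function names, the tolerance `tol`, the assumption that ink is darker than the vellum background, and the tiny sample image are all illustrative assumptions.

```python
def isodata_threshold(pixels, tol=0.5, max_iter=100):
    """Iterative global threshold: average of the two class means until stable."""
    t = sum(pixels) / len(pixels)  # start from the global mean grey level
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:  # one class is empty, nothing to separate
            break
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


def binarize(image, threshold):
    """Assumption: ink is dark, so pixels at or below the threshold become 1."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


def column_runs(binary):
    """Column spans (start, end), end exclusive, containing at least one ink pixel."""
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    runs, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x
        elif not count and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, width))
    return runs


def bounding_boxes(binary):
    """Tight (x0, y0, x1, y1) box per column run, upper bounds exclusive."""
    boxes = []
    for x0, x1 in column_runs(binary):
        rows = [y for y, row in enumerate(binary) if any(row[x0:x1])]
        boxes.append((x0, rows[0], x1, rows[-1] + 1))
    return boxes


# Hypothetical 4x6 grey-level image with two dark strokes on a light background.
image = [
    [200, 200, 200, 200, 200, 200],
    [200,  30, 200, 200,  40, 200],
    [200,  30,  30, 200,  40, 200],
    [200, 200, 200, 200, 200, 200],
]
flat = [p for row in image for p in row]
threshold = isodata_threshold(flat)
boxes = bounding_boxes(binarize(image, threshold))
print(threshold, boxes)
```

A plain projection like this separates only components that have an empty column between them. Touching or overlapping strokes, which are common in aged handwritten manuscripts, would need additional handling of the kind a modified segmentation approach is meant to provide.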