Evaluating the Impact of Khmer Font Types on Text Recognition
This addresses font selection challenges for OCR systems in Khmer script, but it is incremental as it applies an existing method to new data.
The study evaluated the impact of 19 Khmer font types on text recognition accuracy using Pytesseract, finding that fonts like Khmer and Odor MeanChey achieved high accuracy while iSeth First and Bayon performed poorly.
Text recognition is significantly influenced by font types, especially for complex scripts like Khmer. The variety of Khmer fonts, each with its unique character structure, presents challenges for optical character recognition (OCR) systems. In this study, we evaluate the impact of 19 randomly selected Khmer font types on text recognition accuracy using Pytesseract. The fonts include Angkor, Battambang, Bayon, Bokor, Chenla, Dangrek, Freehand, Kh Kompong Chhnang, Kh SN Kampongsom, Khmer, Khmer CN Stueng Songke, Khmer Savuth Pen, Metal, Moul, Odor MeanChey, Preah Vihear, Siemreap, Sithi Manuss, and iSeth First. Our comparison of OCR performance across these fonts reveals that Khmer, Odor MeanChey, Siemreap, Sithi Manuss, and Battambang achieve high accuracy, while iSeth First, Bayon, and Dangrek perform poorly. This study underscores the critical importance of font selection in optimizing Khmer text recognition and provides valuable insights for developing more robust OCR systems.