CV, Mar 18, 2020

Confronting the Constraints for Optical Character Segmentation from Printed Bangla Text Image

Abu Saleh Md. Abir, Sanjana Rahman, Samia Ellin, Maisha Farzana, Md Hridoy Manik, Chowdhury Rafeed Rahman

arXiv:2003.08384v5

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for accurate digitization of printed Bangla text. The contribution is incremental: it focuses on improving segmentation methods for a specific language.

The paper tackles the problem of segmenting characters from printed Bangla text images, in both ideal and non-ideal cases, achieving a sustainable outcome for optical character recognition.

In a world of digitization, optical character recognition (OCR) holds the key to automating the preservation of written history. An OCR system converts images of printed text into editable text for better storage and usability. To be fully functional, the system needs to go through crucial steps such as pre-processing and segmentation. Pre-processing removes noise and corrects skew in the printed data, whereas segmentation splits the image into lines, words, and characters for more accurate conversion. These steps are the key to better accuracy and consistent results when preparing a printed image for conversion. Our proposed algorithm is able to segment characters from both ideal and non-ideal cases of scanned or captured images, giving a sustainable outcome. The implementation of our work is provided here: https://cutt.ly/rgdfBIa
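
To make the line and word segmentation steps mentioned in the abstract concrete, here is a minimal sketch of the classic projection-profile technique. It is not the authors' algorithm, which the abstract does not describe in detail. It assumes the image has already been binarized and deskewed (1 = ink, 0 = background), and the function names and the `min_gap` threshold are hypothetical choices made for illustration.

```python
import numpy as np

def runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(diff == 1)   # False -> True transitions
    ends = np.flatnonzero(diff == -1)    # True -> False transitions
    return list(zip(starts.tolist(), ends.tolist()))

def segment_lines(binary):
    """Split a binary page into text lines: rows containing ink form a line."""
    return runs(binary.sum(axis=1) > 0)

def segment_words(line_img, min_gap=3):
    """Split one line into words; ink runs closer than min_gap columns are merged."""
    words = []
    for start, end in runs(line_img.sum(axis=0) > 0):
        if words and start - words[-1][1] < min_gap:
            words[-1] = (words[-1][0], end)  # small gap: same word
        else:
            words.append((start, end))       # large gap: new word
    return words

# Tiny synthetic page (assumption: already noise-free and deskewed).
page = np.zeros((12, 20), dtype=np.uint8)
page[1:4, 2:5] = 1     # first line, glyph 1
page[1:4, 6:9] = 1     # first line, glyph 2 (1-column gap, same word)
page[1:4, 14:18] = 1   # first line, second word
page[7:10, 3:10] = 1   # second line

lines = segment_lines(page)
first_line_words = segment_words(page[lines[0][0]:lines[0][1]])
```

Blank-column detection of this kind stops at the word level for Bangla, because the characters of a word are usually joined by a headline stroke (matra). Character-level splitting therefore needs extra handling of that stroke, which this sketch does not attempt.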
