CV, Apr 17, 2020

Image Processing Based Scene-Text Detection and Recognition with Tesseract

Ebin Zacharias, Martin Teuchler, Bénédicte Bernier

arXiv:2004.08079v12.321 citations

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for specific use cases in automation, such as text recognition from vehicle-mounted cameras.

The paper tackled scene-text detection and recognition in natural images using a camera on a truck, achieving a correct character recognition rate of over 80%.

Text recognition is a challenging computer vision task of considerable practical interest. Optical character recognition (OCR) enables a range of automation applications. This project focuses on word detection and recognition in natural images, a significantly harder problem than reading text in scanned documents. In the use case considered, the images are captured under constrained conditions, which makes it possible to detect the text area in natural scenes with greater accuracy. The constraint comes from a camera mounted on a truck that captures similar images around the clock. The detected text area is then recognized using the Tesseract OCR engine. Although the approach benefits from low computational requirements, it is limited to specific use cases. This paper discusses a critical false-positive scenario that occurred during testing and describes the strategy used to mitigate it. The project achieved a correct character recognition rate of more than 80%. The paper outlines the stages of development, the major challenges, and some of the interesting findings of the project.
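The abstract reports a correct character recognition rate but does not define how it is computed. As a rough illustration only, the sketch below shows one plausible way to score an OCR output against a ground-truth string using Python's standard `difflib`. The function name `char_recognition_rate` and the matching rule are assumptions made for this example, not the authors' metric.

```python
from difflib import SequenceMatcher


def char_recognition_rate(ground_truth: str, recognized: str) -> float:
    """Share of ground-truth characters matched, in order, by the OCR output.

    Hypothetical metric for illustration; the paper may define its rate differently.
    """
    if not ground_truth:
        raise ValueError("ground_truth must be non-empty")
    # autojunk=False disables difflib's heuristic that ignores very frequent characters.
    matcher = SequenceMatcher(None, ground_truth, recognized, autojunk=False)
    # Matching blocks are non-overlapping, in-order runs shared by both strings.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ground_truth)
```

In practice the `recognized` string would come from the OCR step (for example, Tesseract's output for a detected text region), and the rate would be averaged over a labeled test set.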
