An Automatic Reader of Identity Documents
This work addresses the inefficiency of manual document reading in the service industry. Its contribution is incremental, since it applies existing computer vision techniques to a specific domain.
The paper tackles the problem of manually reading and verifying identity documents by presenting a prototype system that automatically extracts data from photographs of Italian identity documents. The system localizes the document, classifies it, and recognizes its text; its performance is evaluated on a synthetic dataset to avoid privacy issues.
Automatic reading and verification of identity documents is an appealing technology for today's service industry, since this task is still mostly performed manually, wasting both money and time. This paper presents the prototype of a novel system for the automatic reading of identity documents. The system is designed to extract data from the main Italian identity documents, using photographs of acceptable quality such as those usually requested from online subscribers to various services. The document is first localized within the photo and then classified; finally, text recognition is performed. A synthetic dataset was used both for training the neural networks and for evaluating the system's performance. Using synthetic data avoided the privacy issues associated with real photos of real documents, which will instead be used in future developments of the system.
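The three stages described above (localization, classification, text recognition) can be read as a simple sequential pipeline. The following is only a minimal sketch of that control flow, not the authors' implementation: the names `read_document`, `ReadResult`, `bounding_box_of_nonzero`, `classify_by_shape`, and `recognize_placeholder`, as well as the class labels, are hypothetical, and the stage functions are trivial stand-ins for the neural networks mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: an image is a list of pixel rows; a real system would use arrays.
Image = List[List[int]]
# (top, left, bottom, right), with bottom and right exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class ReadResult:
    box: Box                 # where the document was found in the photo
    doc_type: str            # predicted document class
    fields: Dict[str, str]   # recognized text fields


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image delimited by box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def read_document(
    photo: Image,
    localize: Callable[[Image], Box],
    classify: Callable[[Image], str],
    recognize: Callable[[Image, str], Dict[str, str]],
) -> ReadResult:
    """Run the stages in the order given in the text:
    localize the document, classify it, then recognize its text."""
    box = localize(photo)
    document = crop(photo, box)
    doc_type = classify(document)
    fields = recognize(document, doc_type)
    return ReadResult(box, doc_type, fields)


# --- Hypothetical stand-in stages, for illustration only ---

def bounding_box_of_nonzero(image: Image) -> Box:
    """Toy localizer: tight bounding box around all non-zero pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("no document found in photo")
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def classify_by_shape(document: Image) -> str:
    """Toy classifier: decide from the aspect ratio (labels are made up)."""
    height, width = len(document), len(document[0])
    return "landscape_card" if width > height else "portrait_booklet"


def recognize_placeholder(document: Image, doc_type: str) -> Dict[str, str]:
    """Toy recognizer: a real one would run text recognition per field."""
    return {}
```

Passing the stages as callables keeps the pipeline independent of any particular model, so each stand-in could be replaced by a trained network without changing `read_document`.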