CV · Oct 9, 2019

MIDV-2019: Challenges of the modern mobile-based document OCR

arXiv:1910.04009v1 · 61 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the shortage of datasets for computer vision research on mobile document OCR. The contribution is incremental, since it extends the existing MIDV-500 dataset.

The authors tackled the scarcity of datasets for mobile-based document OCR by introducing the MIDV-2019 dataset, which includes video clips with strong projective distortions and low lighting conditions, and provided experimental baselines for text field recognition.

Recognition of identity documents using mobile devices has become the subject of a wide range of computer vision research. The portfolio of methods and algorithms for solving such tasks as face detection, document detection and rectification, text field recognition, and others is growing, and the scarcity of datasets has become an important issue. One of the openly accessible datasets for evaluating such methods is MIDV-500, containing video clips of 50 identity document types in various conditions. However, the variability of capturing conditions in MIDV-500 did not address some of the key issues, mainly significant projective distortions and different lighting conditions. In this paper we present the MIDV-2019 dataset, containing video clips shot with modern high-resolution mobile cameras, with strong projective distortions and with low lighting conditions. A description of the added data is presented, along with experimental baselines for text field recognition in different conditions. The dataset is available for download at ftp://smartengines.com/midv-500/extra/midv-2019/.
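Since the abstract gives an FTP location for the dataset, a minimal Python sketch of how one might inspect that directory is shown here. It uses only the standard library. The function names `split_ftp_url` and `list_dataset_files` are hypothetical, anonymous login is an assumption about the server, and nothing is assumed about the file names or layout inside the directory.

```python
from ftplib import FTP
from urllib.parse import urlparse

# URL as given in the abstract.
DATASET_URL = "ftp://smartengines.com/midv-500/extra/midv-2019/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"expected an ftp:// URL, got {url!r}")
    return parts.hostname, parts.path


def list_dataset_files(url=DATASET_URL, timeout=30):
    """Return the entry names in the dataset directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # anonymous login: an assumption about the server
        ftp.cwd(path)  # move into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` would return whatever entries the server reports at that path, if it is still reachable and accepts anonymous connections. Individual files could then be fetched with `FTP.retrbinary`.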
