CV · Aug 12, 2019 · Code
Self-supervised Data Bootstrapping for Deep Optical Character Recognition of Identity Documents
Oliver Mothes, Joachim Denzler
The essential task of verifying person identities at airports and national borders is very time-consuming. Dictionary-based optical character recognition for identity documents (IDs) is not an appropriate way to accelerate it, because the text content of IDs is highly variable, e.g., individual street names or surnames. Additionally, no properties of the fonts used in IDs are known. Therefore, we propose an iterative self-supervised bootstrapping approach that uses a smart strategy to mine real character data from IDs. In combination with synthetically generated character data, the real data is used to train efficient convolutional neural networks for character classification that offer both a practical runtime and high accuracy. On a dataset with 74 character classes, we achieve an average class-wise accuracy of 99.4 %. In contrast, a classifier trained only on synthetic data reaches an accuracy of just 58.1 %. Finally, we show that our whole proposed pipeline outperforms an established open-source framework.
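The abstract reports an average class-wise accuracy but this listing does not state how it is computed. A minimal sketch, assuming the metric is the unweighted mean of per-class accuracies (so a rare character counts as much as a frequent one), could look like the following. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def average_classwise_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (assumed definition)."""
    correct = defaultdict(int)  # correctly classified samples per true class
    total = defaultdict(int)    # all samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("y_true is empty")
    # Each class contributes equally, regardless of how many samples it has.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy labels: class "A" is frequent, class "B" is rare.
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "A", "B", "8"]
score = average_classwise_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On imbalanced character data the two numbers differ: a mistake on a rare class lowers the class-wise average more than it lowers the overall accuracy, which is presumably why a class-wise figure is reported for 74 classes.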
CV · Feb 13, 2024
JeFaPaTo -- A joint toolbox for blinking analysis and facial features extraction
Tim Büchner, Oliver Mothes, Orlando Guntinas-Lichius et al.
Analyzing facial features and expressions is a complex task in computer vision. The human face is intricate, with significant shape, texture, and appearance variations. In medical contexts, facial structures and movements that differ from the norm are particularly important to study and require precise analysis to understand the underlying conditions. Given that solely the facial muscles, innervated by the facial nerve, are responsible for facial expressions, facial palsy can lead to severe impairments in facial movements. One affected area of interest is the subtle movements involved in blinking. It is an intricate spontaneous process that is not yet fully understood and needs high-resolution, time-specific analysis for detailed understanding. However, a significant challenge is that many computer vision techniques demand programming skills for automated extraction and analysis, making them less accessible to medical professionals who may not have these skills. The Jena Facial Palsy Toolbox (JeFaPaTo) has been developed to bridge this gap. It utilizes cutting-edge computer vision algorithms and offers a user-friendly interface for those without programming expertise. This toolbox makes advanced facial analysis more accessible to medical experts, simplifying integration into their workflow.