Gundram Leifert

h-index7

8papers

67citations

Novelty22%

AI Score16

Ranked #191,256 of 194,257 authors (top 98%)#58,750 in CV (top 99%)

8 Papers

2.6CVAug 26, 2019

End-To-End Measure for Text Recognition

Gundram Leifert, Roger Labahn, Tobias Grüning et al.

Measuring the performance of text recognition and text line detection engines is an important step to objectively compare systems and their configuration. There exist well-established measures for both tasks separately. However, there is no sophisticated evaluation scheme to measure the quality of a combined text line detection and text recognition system. The F-measure on word level is a well-known methodology, which is sometimes used in this context. Nevertheless, it does not take into account the alignment of hypothesis and ground truth text and can lead to deceptive results. Since users of automatic information retrieval pipelines in the context of text recognition are mainly interested in the end-to-end performance of a given system, there is a strong need for such a measure. Hence, we present a measure to evaluate the quality of an end-to-end text recognition system. The basis for this measure is the well established and widely used character error rate, which is limited -- in its original form -- to aligned hypothesis and ground truth texts. The proposed measure is flexible in a way that it can be configured to penalize different reading orders between the hypothesis and ground truth and can take into account the geometric position of the text lines. Additionally, it can ignore over- and under- segmentation of text lines. With these parameters it is possible to get a measure fitting best to its own needs.

3.3CVJul 17, 2018

Bench-Marking Information Extraction in Semi-Structured Historical Handwritten Records

Animesh Prasad, Hervé Déjean, Jean-Luc Meunier et al.

In this report, we present our findings from benchmarking experiments for information extraction on historical handwritten marriage records Esposalles from IEHHR - ICDAR 2017 robust reading competition. The information extraction is modeled as semantic labeling of the sequence across 2 set of labels. This can be achieved by sequentially or jointly applying handwritten text recognition (HTR) and named entity recognition (NER). We deploy a pipeline approach where first we use state-of-the-art HTR and use its output as input for NER. We show that given low resource setup and simple structure of the records, high performance of HTR ensures overall high performance. We explore the various configurations of conditional random fields and neural networks to benchmark NER on given certain noisy input. The best model on 10-fold cross-validation as well as blind test data uses n-gram features with bidirectional long short-term memory.

3.2IRApr 26, 2018

System Description of CITlab's Recognition & Retrieval Engine for ICDAR2017 Competition on Information Extraction in Historical Handwritten Records

Tobias Strauß, Max Weidemann, Johannes Michael et al.

We present a recognition and retrieval system for the ICDAR2017 Competition on Information Extraction in Historical Handwritten Records which successfully infers person names and other data from marriage records. The system extracts information from the line images with a high accuracy and outperforms the baseline. The optical model is based on Neural Networks. To infer the desired information, regular expressions are used to describe the set of feasible words sequences.

1.1CVMay 26, 2016

CITlab ARGUS for historical handwritten documents

Gundram Leifert, Tobias Strauß, Tobias Grüning et al.

We describe CITlab's recognition system for the HTRtS competition attached to the 13. International Conference on Document Analysis and Recognition, ICDAR 2015. The task comprises the recognition of historical handwritten documents. The core algorithms of our system are based on multi-dimensional recurrent neural networks (MDRNN) and connectionist temporal classification (CTC). The software modules behind that as well as the basic utility technologies are essentially powered by PLANET's ARGUS framework for intelligent text recognition and image processing.

5.3NESep 15, 2015

Regular expressions for decoding of neural network outputs

Tobias Strauß, Gundram Leifert, Tobias Grüning et al.

This article proposes a convenient tool for decoding the output of neural networks trained by Connectionist Temporal Classification (CTC) for handwritten text recognition. We use regular expressions to describe the complex structures expected in the writing. The corresponding finite automata are employed to build a decoder. We analyze theoretically which calculations are relevant and which can be avoided. A great speed-up results from an approximation. We conclude that the approximation most likely fails if the regular expression does not match the ground truth which is not harmful for many applications since the low probability will be even underestimated. The proposed decoder is very efficient compared to other decoding methods. The variety of applications reaches from information retrieval to full text recognition. We refer to applications where we integrated the proposed decoder successfully.

3.5CVDec 15, 2014

CITlab ARGUS for Arabic Handwriting

Gundram Leifert, Roger Labahn, Tobias Strauß

In the recent years it turned out that multidimensional recurrent neural networks (MDRNN) perform very well for offline handwriting recognition tasks like the OpenHaRT 2013 evaluation DIR. With suitable writing preprocessing and dictionary lookup, our ARGUS software completed this task with an error rate of 26.27% in its primary setup.

3.5CVDec 15, 2014

CITlab ARGUS for historical data tables

Gundram Leifert, Tobias Grüning, Tobias Strauß et al.

We describe CITlab's recognition system for the ANWRESH-2014 competition attached to the 14. International Conference on Frontiers in Handwriting Recognition, ICFHR 2014. The task comprises word recognition from segmented historical documents. The core components of our system are based on multi-dimensional recurrent neural networks (MDRNN) and connectionist temporal classification (CTC). The software modules behind that as well as the basic utility technologies are essentially powered by PLANET's ARGUS framework for intelligent text recognition and image processing.

17.4AIDec 8, 2014

Cells in Multidimensional Recurrent Neural Networks

G. Leifert, T. Strauß, T. Grüning et al.

The transcription of handwritten text on images is one task in machine learning and one solution to solve it is using multi-dimensional recurrent neural networks (MDRNN) with connectionist temporal classification (CTC). The RNNs can contain special units, the long short-term memory (LSTM) cells. They are able to learn long term dependencies but they get unstable when the dimension is chosen greater than one. We defined some useful and necessary properties for the one-dimensional LSTM cell and extend them in the multi-dimensional case. Thereby we introduce several new cells with better stability. We present a method to design cells using the theory of linear shift invariant systems. The new cells are compared to the LSTM cell on the IFN/ENIT and Rimes database, where we can improve the recognition rate compared to the LSTM cell. So each application where the LSTM cells in MDRNNs are used could be improved by substituting them by the new developed cells.