Key-value information extraction from full handwritten pages
This work addresses automated key-value extraction from full handwritten pages for document digitization and archival, replacing existing multi-stage pipelines with a single integrated model.
The authors propose a Transformer-based model that integrates feature extraction, handwriting recognition, and named entity recognition into a single system, and report that it outperforms previous methods on three public datasets (IAM, ESPOSALLES, and POPP).
We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the steps that have so far been performed by separate models: feature extraction, handwriting recognition, and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at three levels: line, paragraph, and page. Our experiments show that attention-based models are particularly well suited to full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previous results on all three datasets.
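To make the notion of key-value annotations concrete, the following minimal sketch shows one way such annotations could be turned into a target sequence for a sequence-to-sequence model and read back from a prediction. The `<label>` tag syntax, the label names, and the helper functions `serialize` and `parse` are illustrative assumptions, not the encoding used in this work.

```python
import re
from typing import List, Optional, Tuple

# A key-value annotation: (named-entity label, important word or words).
KeyValue = Tuple[str, str]

# Hypothetical tag syntax for entity labels, e.g. "<name>" (an assumption).
TAG = re.compile(r"^<([a-z_]+)>$")


def serialize(pairs: List[KeyValue]) -> str:
    """Flatten key-value pairs into one target string, each value preceded by its tag."""
    return " ".join(f"<{label}> {value}" for label, value in pairs)


def parse(sequence: str) -> List[KeyValue]:
    """Recover key-value pairs from a tagged string; words before the first tag are dropped."""
    pairs: List[KeyValue] = []
    label: Optional[str] = None
    words: List[str] = []
    for token in sequence.split():
        match = TAG.match(token)
        if match:
            # A new tag closes the entity currently being collected.
            if label is not None:
                pairs.append((label, " ".join(words)))
            label, words = match.group(1), []
        elif label is not None:
            words.append(token)
    if label is not None:
        pairs.append((label, " ".join(words)))
    return pairs


if __name__ == "__main__":
    annotations = [("name", "Joan"), ("location", "Sant Andreu")]
    target = serialize(annotations)
    print(target)
    print(parse(target))
```

Under these assumptions, only the annotated words and their labels appear in the target, so a model trained on such sequences would not need a full transcription of the page.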