CV · Apr 16, 2020

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

arXiv:2004.07464v3 · 156 citations · Has Code
AI Analysis

This paper addresses the problem of extracting key information from complex documents for OCR and document-processing applications. It represents an incremental advance in integrating visual and textual features.

The paper tackles the challenge of Key Information Extraction (KIE) from documents by developing PICK, a framework that combines graph learning with graph convolution to better utilize both textual and visual features, resulting in significant performance improvements over baseline methods on real-world datasets.

Computer vision with state-of-the-art deep learning models has recently achieved great success in Optical Character Recognition (OCR), including text detection and recognition tasks. However, Key Information Extraction (KIE) from documents, a downstream task of OCR with many real-world use cases, remains a challenge because documents have not only textual features extracted by OCR systems but also semantic visual features that are not fully exploited and play a critical role in KIE. Little work has been devoted to making full, efficient use of both the textual and visual features of documents. In this paper, we introduce PICK, a framework that is effective and robust in handling complex document layouts for KIE by combining graph learning with graph convolution operations, yielding a richer semantic representation containing the textual and visual features and global layout without ambiguity. Extensive experiments on real-world datasets show that our method outperforms baseline methods by significant margins. Our code is available at https://github.com/wenwenyu/PICK-pytorch.
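To make "combining graph learning with graph convolution" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). It assumes each text segment is a node with an already-fused text-plus-visual feature vector, learns a soft adjacency matrix from pairwise feature differences, and then runs one graph convolution step. The names `learn_adjacency` and `graph_conv`, the toy features, and all weights are hypothetical and chosen only for illustration.

```python
import math

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def learn_adjacency(nodes, w):
    """Graph learning (assumed form): A[i][j] = softmax_j(LeakyReLU(w . |v_i - v_j|))."""
    adj = []
    for vi in nodes:
        scores = [
            leaky_relu(sum(wk * abs(a - b) for wk, a, b in zip(w, vi, vj)))
            for vj in nodes
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        adj.append([e / total for e in exps])
    return adj

def graph_conv(nodes, adj, weight):
    """One graph convolution step: H' = ReLU(A H W)."""
    n, d = len(nodes), len(nodes[0])
    # Aggregate neighbour features using the learned soft adjacency.
    agg = [[sum(adj[i][j] * nodes[j][k] for j in range(n)) for k in range(d)]
           for i in range(n)]
    # Project the aggregated features and apply ReLU.
    return [[max(0.0, sum(row[k] * weight[k][o] for k in range(d)))
             for o in range(len(weight[0]))]
            for row in agg]

# Toy input: three text segments, each with a 2-d fused feature (hypothetical values).
nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
w = [-1.0, -1.0]                    # negative weights: similar nodes connect more strongly
weight = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical projection matrix

adj = learn_adjacency(nodes, w)
out = graph_conv(nodes, adj, weight)
```

In this sketch the graph is inferred from the node features rather than fixed in advance, which is the part that corresponds to "graph learning"; in a trained model `w` and `weight` would be learned parameters rather than constants.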

Code Implementations · 2 repos
