AI · Jul 20, 2023

PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts

arXiv:2307.10551v1 · 5 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This paper addresses the extraction of structured information from visually rich documents, with applications such as document processing. The contribution is incremental in that it builds on existing KIE methods.

The paper tackles key information extraction from complex document layouts in two ways: it introduces a new dataset (CLEX) with 5,860 images and 1,162 semantic entity categories, and it proposes an end-to-end model (PPN) that outperforms existing state-of-the-art methods on CLEX while running inference faster.

Key Information Extraction (KIE) is a challenging multimodal task that aims to extract structured value semantic entities from visually rich documents. Although significant progress has been made, two major challenges remain. First, existing datasets have relatively fixed layouts and a limited number of semantic entity categories, creating a significant gap between these datasets and complex real-world scenarios. Second, existing methods follow a two-stage pipeline strategy, which may lead to error propagation, and they are difficult to apply when unseen semantic entity categories emerge. To address the first challenge, we propose a new large-scale human-annotated dataset named Complex Layout form for key information EXtraction (CLEX), which consists of 5,860 images with 1,162 semantic entity categories. To address the second challenge, we introduce Parallel Pointer-based Network (PPN), an end-to-end model that can be applied in zero-shot and few-shot scenarios. PPN leverages the implicit clues between semantic entities to assist extraction, and its parallel extraction mechanism allows it to extract multiple results simultaneously and efficiently. Experiments on the CLEX dataset demonstrate that PPN outperforms existing state-of-the-art methods while also offering much faster inference.
