IR, AI, CL, LG | Aug 13, 2021

Zero-shot Task Transfer for Invoice Extraction via Class-aware QA Ensemble

arXiv:2108.06069v1 | 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses enterprise document extraction by enabling zero-shot transfer across layouts and domains. The advance is incremental because it builds on existing QA methods.

The paper tackles the challenge of extracting information from invoices without labeled data by converting the task into a natural language question-answering problem, achieving an average F1 score of 87.50 on a real-world dataset.

We present VESPA, an intentionally simple yet novel zero-shot system for layout, locale, and domain agnostic document extraction. In spite of the availability of large corpora of documents, the lack of labeled and validated datasets makes it a challenge to discriminatively train document extraction models for enterprises. We show that this problem can be addressed by simply transferring the information extraction (IE) task to a natural language Question-Answering (QA) task without engineering task-specific architectures. We demonstrate the effectiveness of our system by evaluating on a closed corpus of real-world retail and tax invoices with multiple complex layouts, domains, and geographies. The empirical evaluation shows that our system outperforms 4 prominent commercial invoice solutions that use discriminatively trained models with architectures specifically crafted for invoice extraction. We extracted 6 fields with zero upfront human annotation or training with an Avg. F1 of 87.50.
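The core idea in the abstract, recasting field extraction as question answering, can be illustrated with a minimal sketch. It is not VESPA's implementation: the field names, question wording, `Answer` type, and the two toy models below are hypothetical stand-ins, and the highest-score selection rule is only one plausible way to combine an ensemble. In practice each `QAModel` would wrap a pretrained extractive QA model applied to the OCR text of an invoice.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Answer:
    text: str
    score: float  # model confidence in [0, 1]


# A QA model is any callable taking (question, context) and returning an Answer.
QAModel = Callable[[str, str], Answer]

# Hypothetical field-to-question mapping; the paper's six fields are not listed here.
FIELD_QUESTIONS: Dict[str, str] = {
    "invoice_number": "What is the invoice number?",
    "invoice_date": "What is the invoice date?",
    "total_amount": "What is the total amount due?",
}


def extract_fields(
    document_text: str,
    models: List[QAModel],
    questions: Dict[str, str] = FIELD_QUESTIONS,
    min_score: float = 0.5,
) -> Dict[str, Optional[str]]:
    """Ask every model each field's question and keep the most confident answer."""
    results: Dict[str, Optional[str]] = {}
    for field, question in questions.items():
        candidates = [model(question, document_text) for model in models]
        best = max(candidates, key=lambda a: a.score, default=None)
        if best is not None and best.text and best.score >= min_score:
            results[field] = best.text
        else:
            results[field] = None  # no model was confident enough
    return results


def keyword_model(question: str, context: str) -> Answer:
    """Toy stand-in for a QA model: returns the token after a cue phrase."""
    cues = {
        "invoice number": r"Invoice No[.:]?\s*(\S+)",
        "date": r"Date[.:]?\s*(\S+)",
        "total": r"Total[.:]?\s*(\S+)",
    }
    for cue, pattern in cues.items():
        if cue in question.lower():
            match = re.search(pattern, context, flags=re.IGNORECASE)
            if match:
                return Answer(match.group(1), 0.9)
    return Answer("", 0.0)


def first_token_model(question: str, context: str) -> Answer:
    """Toy low-confidence model: always guesses the first token of the document."""
    tokens = context.split()
    return Answer(tokens[0] if tokens else "", 0.1)


invoice_text = "Invoice No: INV-1042\nDate: 2021-08-13\nTotal: 118.00 EUR"
fields = extract_fields(invoice_text, [keyword_model, first_token_model])
```

Because no model is trained on labeled invoices, supporting a new field in this sketch only requires adding a question to the mapping, which is the property the zero-shot framing relies on.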

