An Augmentation Strategy for Visually Rich Documents
This work addresses the challenge of automating document extraction for business workflows when training data is limited, though the contribution is incremental because it builds on existing augmentation methods.
The paper tackles the problem of extracting fields from visually rich documents when training data is scarce. It proposes a data augmentation technique called FieldSwap, which swaps key phrases to generate synthetic examples and yields improvements of 1-7 F1 points in extraction performance.
Many business workflows require extracting important fields from form-like documents (e.g. bank statements, bills of lading, purchase orders). Recent techniques for automating this task work well only when trained with large datasets. In this work we propose a novel data augmentation technique to improve performance when training data is scarce, e.g. 10-250 documents. Our technique, which we call FieldSwap, works by swapping out the key phrases of a source field with the key phrases of a target field to generate new synthetic examples of the target field for use in training. We demonstrate that this approach can yield improvements of 1-7 F1 points in extraction performance.
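The swap described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a document is a flat list of tokens with one field label per token, that the key phrase of each field is already known, and it ignores layout coordinates entirely. The names `Example`, `find_phrase`, and `field_swap`, and the field labels `invoice_date` and `due_date`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    """A toy document: tokens plus one field label per token ("O" = no field)."""
    tokens: List[str]
    labels: List[str]


def find_phrase(tokens: List[str], phrase: List[str]) -> Optional[int]:
    """Return the start index of the first case-insensitive match, or None."""
    lowered = [t.lower() for t in tokens]
    target = [p.lower() for p in phrase]
    for i in range(len(tokens) - len(phrase) + 1):
        if lowered[i:i + len(phrase)] == target:
            return i
    return None


def field_swap(example: Example,
               source_field: str, source_phrase: List[str],
               target_field: str, target_phrase: List[str]) -> Optional[Example]:
    """Build a synthetic example of target_field from an example of source_field.

    The source key phrase is replaced by the target key phrase, and the value
    tokens labeled source_field are relabeled as target_field. Returns None
    when the example has no source key phrase or no source field value.
    """
    start = find_phrase(example.tokens, source_phrase)
    if start is None or source_field not in example.labels:
        return None
    end = start + len(source_phrase)
    # Assumption of this sketch: key phrase tokens carry no field label.
    tokens = example.tokens[:start] + list(target_phrase) + example.tokens[end:]
    labels = example.labels[:start] + ["O"] * len(target_phrase) + example.labels[end:]
    labels = [target_field if lab == source_field else lab for lab in labels]
    return Example(tokens, labels)


doc = Example(
    tokens=["Invoice", "Date", ":", "2023-01-05", "Total", ":", "$40"],
    labels=["O", "O", "O", "invoice_date", "O", "O", "total"],
)
synthetic = field_swap(doc, "invoice_date", ["Invoice", "Date"],
                       "due_date", ["Due", "Date"])
print(synthetic.tokens)
print(synthetic.labels)
```

In this toy setting, a document containing only an invoice date yields an extra training example for a due date field. Which source and target field pairs are sensible to swap, and how key phrases are identified in real documents, are left open by this sketch.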