Understanding Scanned Receipts
This work addresses the problem of automating receipt analysis for applications such as purchase analytics and expense policy enforcement, though it appears incremental because it builds on existing techniques.
The paper tackles the problem of linking shorthand text from OCR'd receipts to a knowledge base of grocery products, framed as Named Entity Linking. Experiments that combine Information Retrieval techniques with statistical phrase detection show promise for effective understanding of scanned receipt data.
Tasking machines with understanding receipts can have important applications such as enabling detailed analytics on purchases, enforcing expense policies, and inferring patterns of purchase behavior on large collections of receipts. In this paper, we focus on the task of Named Entity Linking (NEL) of scanned receipt line items; specifically, the task entails associating shorthand text from OCR'd receipts with a knowledge base (KB) of grocery products. For example, the scanned item "STO BABY SPINACH" should be linked to the catalog item labeled "Simple Truth Organic Baby Spinach". Experiments that employ a variety of Information Retrieval techniques in combination with statistical phrase detection show promise for effective understanding of scanned receipt data.
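To make the linking task concrete, here is a minimal sketch of one retrieval-style approach: ranking catalog labels by TF-IDF cosine similarity over character trigrams. This is an illustration under stated assumptions, not the method used in the paper. The function names (char_ngrams, build_index, link) and every catalog entry other than "Simple Truth Organic Baby Spinach" are hypothetical, and the sketch does not attempt to expand abbreviations such as "STO".

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def normalise(vec):
    """Scale a sparse vector to unit L2 length (empty vectors are returned as-is)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else vec


def build_index(catalog, n=3):
    """Compute smoothed IDF weights and unit-length TF-IDF vectors for catalog labels."""
    docs = [Counter(char_ngrams(label, n)) for label in catalog]
    # Document frequency: iterating a Counter yields each distinct n-gram once.
    df = Counter(g for doc in docs for g in doc)
    idf = {g: math.log((1 + len(docs)) / (1 + c)) + 1.0 for g, c in df.items()}
    vectors = [normalise({g: tf * idf[g] for g, tf in doc.items()}) for doc in docs]
    return idf, vectors


def link(line_item, catalog, idf, vectors, n=3):
    """Return (best catalog label, cosine score) for one receipt line item."""
    query = Counter(char_ngrams(line_item, n))
    # N-grams never seen in the catalog carry no signal and are dropped.
    q = normalise({g: tf * idf[g] for g, tf in query.items() if g in idf})
    scores = [sum(w * vec.get(g, 0.0) for g, w in q.items()) for vec in vectors]
    best = max(range(len(catalog)), key=scores.__getitem__)
    return catalog[best], scores[best]


# Hypothetical toy knowledge base; only the first label comes from the abstract.
catalog = [
    "Simple Truth Organic Baby Spinach",
    "Organic Bananas",
    "Whole Milk 1 Gallon",
    "Sourdough Bread Loaf",
]
idf, vectors = build_index(catalog)
label, score = link("STO BABY SPINACH", catalog, idf, vectors)
print(label, round(score, 3))
```

In this toy catalog the match is carried by the shared "baby spinach" trigrams; the brand abbreviation "STO" contributes almost nothing. A real catalog with several baby spinach products would need an additional signal to tell brands apart, which is presumably where techniques such as the statistical phrase detection mentioned above come in.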