CV | Dec 9, 2019

Modular Multimodal Architecture for Document Classification

Tyler Dauphinee, Nikunj Patel, Mohammad Rashidi

arXiv:1912.04376v1 | 30 citations

Originality: Incremental advance

AI Analysis

Accurate page classification lets a document analysis system branch its control flow, sending each component of a document to the processing step suited to it.

The paper tackles document page classification by using both visual and textual content, achieving a state-of-the-art result of 93.03% test accuracy on the RVL-CDIP benchmark.

Page classification is a crucial component of any document analysis system, allowing for complex branching control flows for different components of a given document. Utilizing both the visual and textual content of a page, the proposed method exceeds the current state-of-the-art performance on the RVL-CDIP benchmark, reaching 93.03% test accuracy.
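The abstract does not describe how the visual and textual signals are combined, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of class probabilities, followed by the kind of class-based branching the abstract mentions. The function names (`fuse`, `classify_page`, `route`), the three-label subset, and the `visual_weight` parameter are all hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, Dict, List

# Illustrative subset of page classes; the real label set is an assumption here.
LABELS = ["letter", "invoice", "form"]


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(visual_scores: List[float], text_scores: List[float],
         visual_weight: float = 0.5) -> List[float]:
    """Late fusion (assumed, not the paper's method): weighted average of
    the per-class probabilities from a visual model and a text model."""
    pv = softmax(visual_scores)
    pt = softmax(text_scores)
    return [visual_weight * v + (1.0 - visual_weight) * t
            for v, t in zip(pv, pt)]


def classify_page(visual_scores: List[float], text_scores: List[float]) -> str:
    """Return the label with the highest fused probability."""
    probs = fuse(visual_scores, text_scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best]


def route(label: str, handlers: Dict[str, Callable[[], str]],
          default: Callable[[], str]) -> str:
    """Branch the downstream pipeline on the predicted page class."""
    return handlers.get(label, default)()


if __name__ == "__main__":
    # Hypothetical raw scores from the two single-modality models.
    label = classify_page([2.0, 0.0, 0.0], [0.0, 0.0, 3.0])
    handlers = {
        "invoice": lambda: "extract line items",
        "form": lambda: "extract form fields",
    }
    print(label, "->", route(label, handlers, lambda: "archive as-is"))
```

Averaging probabilities keeps the two single-modality models independent, so either can be replaced without retraining the other. A learned fusion layer is another common choice, and the sketch makes no claim about which one the paper uses.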
