CV | Dec 9, 2019

Modular Multimodal Architecture for Document Classification

arXiv:1912.04376v1 | 30 citations
Originality: Incremental advance
AI Analysis

Accurate page classification improves document analysis systems by letting them route different document components through different processing branches.

The paper tackles document page classification by using both visual and textual content, achieving a state-of-the-art result of 93.03% test accuracy on the RVL-CDIP benchmark.

Page classification is a crucial component of any document analysis system, allowing for complex branching control flows for different components of a given document. Utilizing both the visual and textual content of a page, the proposed method exceeds the current state-of-the-art performance on the RVL-CDIP benchmark, reaching 93.03% test accuracy.
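
The abstract does not say how the visual and textual signals are combined, so the following is only a minimal sketch of one common option: weighted late fusion of per-class probabilities, followed by dispatch on the predicted class to illustrate the "branching control flow" idea. The label subset, the probability values, the weight `w_visual`, and the handler functions are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict

# Hypothetical subset of labels; RVL-CDIP itself defines 16 classes.
LABELS = ["letter", "invoice", "resume"]


def fuse(visual: Dict[str, float], textual: Dict[str, float],
         w_visual: float = 0.5) -> Dict[str, float]:
    """Weighted average of two per-class probability distributions.

    This is an assumed fusion rule for illustration, not the paper's method.
    """
    return {c: w_visual * visual[c] + (1.0 - w_visual) * textual[c]
            for c in LABELS}


def classify(visual: Dict[str, float], textual: Dict[str, float],
             w_visual: float = 0.5) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse(visual, textual, w_visual)
    return max(fused, key=fused.get)


# Hypothetical downstream handlers: one processing branch per page class.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "letter": lambda page_id: f"{page_id}: extract sender and recipient",
    "invoice": lambda page_id: f"{page_id}: extract totals and line items",
    "resume": lambda page_id: f"{page_id}: extract work history",
}


def route(page_id: str, visual: Dict[str, float],
          textual: Dict[str, float]) -> str:
    """Classify a page, then dispatch it to the matching handler."""
    return HANDLERS[classify(visual, textual)](page_id)


if __name__ == "__main__":
    # Made-up model outputs: the visual model leans "letter",
    # the text model leans "invoice".
    visual_probs = {"letter": 0.5, "invoice": 0.4, "resume": 0.1}
    text_probs = {"letter": 0.2, "invoice": 0.7, "resume": 0.1}
    print(route("page-001", visual_probs, text_probs))
```

In this sketch the two models disagree, and the fused score decides which branch handles the page. A learned combiner could replace the fixed weight where labeled data is available.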

Code Implementations: 3 repos
