CV | Jan 22, 2024

Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis

arXiv:2401.11874v2 | 18 citations | h-index: 16 | Pattern Recognition
Originality: Incremental advance
AI Analysis

This work addresses the problem of understanding complex document layouts for applications such as information retrieval and knowledge extraction. It is an incremental improvement accompanied by a new benchmark.

The paper tackles hierarchical document structure analysis by proposing a tree construction approach that concurrently handles object detection, reading order prediction, and structure construction. It reports state-of-the-art performance on PubLayNet, DocLayNet, HRDoc, and a new benchmark, Comp-HRDoc.

Document structure analysis (aka document layout analysis) is crucial for understanding the physical layout and logical structure of documents, with applications in information retrieval, document summarization, knowledge extraction, etc. In this paper, we concentrate on Hierarchical Document Structure Analysis (HDSA) to explore hierarchical relationships within structured documents created using authoring software employing hierarchical schemas, such as LaTeX, Microsoft Word, and HTML. To comprehensively analyze hierarchical document structures, we propose a tree construction based approach that addresses multiple subtasks concurrently, including page object detection (Detect), reading order prediction of identified objects (Order), and the construction of intended hierarchical structure (Construct). We present an effective end-to-end solution based on this framework to demonstrate its performance. To assess our approach, we develop a comprehensive benchmark called Comp-HRDoc, which evaluates the above subtasks simultaneously. Our end-to-end system achieves state-of-the-art performance on two large-scale document layout analysis datasets (PubLayNet and DocLayNet), a high-quality hierarchical document structure reconstruction dataset (HRDoc), and our Comp-HRDoc benchmark. The Comp-HRDoc benchmark will be released to facilitate further research in this field.
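To make the Detect, Order, Construct decomposition described in the abstract concrete, the following is a minimal toy sketch in Python. It is not the paper's method: the paper presents a learned end-to-end system, whereas this sketch stubs detection with hand-written objects, orders them with a naive single-column sort, and builds the tree with a simple heading-level stack. The `PageObject` schema, its `level` field, and the functions `order`, `construct`, and `outline` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageObject:
    """A detected page object (hypothetical schema, not the paper's)."""
    text: str
    label: str          # e.g. "section" or "paragraph"
    page: int
    top: float          # y-coordinate of the top edge
    left: float         # x-coordinate of the left edge
    level: int = 0      # heading depth; 0 means body content
    children: List["PageObject"] = field(default_factory=list)


def order(objects):
    """Order: naive single-column reading order (page, then top, then left)."""
    return sorted(objects, key=lambda o: (o.page, o.top, o.left))


def construct(ordered):
    """Construct: attach each object under the nearest preceding shallower heading."""
    root = PageObject("ROOT", "root", 0, 0.0, 0.0)
    stack = [root]  # path of currently open headings, root first
    for obj in ordered:
        if obj.level > 0:
            # Close headings that are at the same depth or deeper.
            while len(stack) > 1 and stack[-1].level >= obj.level:
                stack.pop()
            stack[-1].children.append(obj)
            stack.append(obj)
        else:
            # Body content belongs to the innermost open heading.
            stack[-1].children.append(obj)
    return root


def outline(node, depth=0):
    """Render the tree as indented lines, two spaces per depth level."""
    lines = ["  " * depth + node.text]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Detect: stubbed here with hand-written, deliberately unordered objects.
detections = [
    PageObject("Body of 1.1", "paragraph", 1, 300.0, 50.0),
    PageObject("1 Introduction", "section", 1, 100.0, 50.0, level=1),
    PageObject("1.1 Background", "section", 1, 200.0, 50.0, level=2),
    PageObject("2 Method", "section", 2, 100.0, 50.0, level=1),
    PageObject("Body of 2", "paragraph", 2, 200.0, 50.0),
]

tree = construct(order(detections))
print("\n".join(outline(tree)))
```

In this toy version the three stages run strictly in sequence and rely on known heading levels, which is exactly the information a real system would have to predict. Multi-column layouts, floating figures, and tables would defeat the sort-based ordering used here.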

Code Implementations: 1 repo
