CV, LG · Mar 17, 2022

Unified Line and Paragraph Detection by Graph Convolutional Networks

arXiv:2203.09638v1 · 8 citations · h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses document layout analysis for applications such as OCR and digitization, but the advance is incremental because it builds on existing graph-based methods.

The paper formulates line and paragraph detection in documents as a unified two-level clustering problem and reports state-of-the-art paragraph detection quality on public benchmarks and real-world images, with high efficiency.

We formulate the task of detecting lines and paragraphs in a document as a unified two-level clustering problem. Given a set of text detection boxes that roughly correspond to words, a text line is a cluster of boxes and a paragraph is a cluster of lines. These clusters form a two-level tree that represents a major part of the layout of a document. We use a graph convolutional network to predict the relations between text detection boxes and then build both levels of clusters from these predictions. Experimentally, we demonstrate that the unified approach can be highly efficient while still achieving state-of-the-art quality for detecting paragraphs in public benchmarks and real-world images.
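The two-level clustering described in the abstract can be illustrated with a minimal sketch. It assumes the pairwise relations have already been predicted (the graph convolutional network itself is not shown) and that they arrive as two lists of box-index pairs: one for "same line" and one for "same paragraph". Grouping by connected components with union-find is an assumption made here for illustration and is not confirmed as the paper's clustering procedure. The names `connected_components` and `build_layout_tree` are hypothetical.

```python
from collections import defaultdict


def connected_components(n, edges):
    """Cluster items 0..n-1 by union-find over the given index pairs."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as the root.
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    # Order clusters by their smallest member for deterministic output.
    return sorted(groups.values(), key=lambda g: g[0])


def build_layout_tree(num_boxes, same_line_edges, same_paragraph_edges):
    """Hypothetical helper: boxes -> lines -> paragraphs.

    same_line_edges: box-index pairs assumed to be predicted as one line.
    same_paragraph_edges: box-index pairs assumed to be predicted as one
        paragraph; they are lifted to line level before clustering.
    Returns a list of paragraphs, each a list of lines, each a list of boxes.
    """
    # Level 1: a text line is a cluster of boxes.
    lines = connected_components(num_boxes, same_line_edges)
    box_to_line = {b: li for li, line in enumerate(lines) for b in line}

    # Level 2: a paragraph is a cluster of lines.
    line_edges = [(box_to_line[a], box_to_line[b]) for a, b in same_paragraph_edges]
    paragraphs = connected_components(len(lines), line_edges)
    return [[lines[li] for li in para] for para in paragraphs]


# Five word boxes: 0-1 share a line, 2-3 share a line, box 4 stands alone.
# Boxes 1 and 2 are linked at paragraph level, which joins the first two lines.
tree = build_layout_tree(5, [(0, 1), (2, 3)], [(1, 2)])
print(tree)
```

In this toy input the result is a two-level tree with two paragraphs: the first holds two lines and the second holds the single isolated box. A real pipeline would also need to threshold the network's edge scores before clustering, which this sketch leaves out.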

