CV | IR | Mar 8, 2022

Table Structure Recognition with Conditional Attention

arXiv:2203.03819v1 | 14 citations | h-index: 46
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of parsing tables from PDFs and images for downstream tasks such as semantic modeling and information retrieval. The contribution appears incremental, since it builds on graph-based representations and attention mechanisms.

The paper tackles table structure recognition in unstructured documents by representing each table as a graph and formulating recognition as a cell association classification problem. Its proposed conditional attention network (CATT-Net) achieves a Micro-averaged F1 score of up to 0.963 and a Macro-averaged F1 score of up to 0.923.
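To make the difference between the two reported averages concrete, here is a minimal sketch of micro- versus macro-averaged F1. It is not the paper's evaluation code: the grouping unit (per class, per table, or per document) is an assumption left generic here, and the names `f1`, `micro_f1`, `macro_f1`, and `example_groups` are hypothetical, as are the example counts.

```python
from typing import List, Tuple

# One group = (true positives, false positives, false negatives).
# What a "group" is (a class, a table, a document) is an assumption of this sketch.
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(groups: List[Counts]) -> float:
    """Pool the counts of all groups first, then compute a single F1."""
    tp = sum(g[0] for g in groups)
    fp = sum(g[1] for g in groups)
    fn = sum(g[2] for g in groups)
    return f1(tp, fp, fn)


def macro_f1(groups: List[Counts]) -> float:
    """Compute F1 per group, then take the unweighted mean."""
    if not groups:
        return 0.0
    return sum(f1(*g) for g in groups) / len(groups)


# Illustrative counts only: one large, easy group and one small, hard group.
example_groups = [(90, 5, 5), (2, 4, 4)]
micro = micro_f1(example_groups)
macro = macro_f1(example_groups)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

Because the micro average is dominated by groups with many relations while the macro average weights every group equally, a gap between the two (as in the reported 0.963 versus 0.923) usually indicates that some smaller or harder groups score lower than the bulk.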

Tabular data in digital documents is widely used to express compact and important information for readers. However, it is challenging to parse tables from unstructured digital documents, such as PDFs and images, into a machine-readable format because of the complexity of table structures and the absence of meta-information. Table Structure Recognition (TSR) aims to recognize the structure of a table and transform unstructured tables into a structured, machine-readable format so that the tabular data can be further analysed by downstream tasks, such as semantic modeling and information retrieval. In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and the associations between cells, respectively. We then define the table structure recognition problem as a cell association classification problem and propose a conditional attention network (CATT-Net). The experimental results demonstrate the superiority of our proposed method over the state-of-the-art methods on various datasets. In addition, we investigate whether the alignment of a cell bounding box or a text-focused approach has more impact on model performance. Because public datasets lack annotations based on these two approaches, we further annotate the ICDAR2013 dataset with both types of bounding boxes, which can serve as a new benchmark dataset for evaluating methods in this field. Experimental results show that the alignment of a cell bounding box can help improve the Micro-averaged F1 score from 0.915 to 0.963, and the Macro-averaged F1 score from 0.787 to 0.923.
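The graph formulation in the abstract can be illustrated with a small sketch: cells are vertices, and every pair of cells receives an association label. The label set used here (`"same_row"`, `"same_col"`, `"none"`), the `Cell` fields, and the helper names are assumptions made for illustration; the abstract does not specify the classes CATT-Net predicts. The labels are derived from known row and column spans, which is what a ground-truth annotation would provide, not what a trained model would output.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cell:
    """A table cell with inclusive row/column spans (hypothetical schema)."""
    text: str
    row_start: int
    row_end: int
    col_start: int
    col_end: int


def spans_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True when two inclusive integer intervals share at least one index."""
    return a_start <= b_end and b_start <= a_end


def association(a: Cell, b: Cell) -> str:
    """Label the edge between two distinct, non-overlapping cells."""
    if spans_overlap(a.row_start, a.row_end, b.row_start, b.row_end):
        return "same_row"
    if spans_overlap(a.col_start, a.col_end, b.col_start, b.col_end):
        return "same_col"
    return "none"


def build_association_graph(cells: List[Cell]) -> Dict[Tuple[int, int], str]:
    """Return one labeled edge per unordered pair of cell indices."""
    return {
        (i, j): association(cells[i], cells[j])
        for i, j in combinations(range(len(cells)), 2)
    }


# A header spanning two columns above two body cells.
cells = [
    Cell("Header", 0, 0, 0, 1),
    Cell("a", 1, 1, 0, 0),
    Cell("b", 1, 1, 1, 1),
]
edges = build_association_graph(cells)
for pair, label in edges.items():
    print(pair, label)
```

Under this framing, a recognizer is trained to predict the label of each pair from visual and textual features of the two cells, and the full table structure is recovered from the predicted edges.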

