CV · Sep 6, 2021

Parsing Table Structures in the Wild

arXiv:2109.02199v1 · 79 citations
Originality: Incremental advance
AI Analysis

The paper addresses a practical problem for real-world applications such as document digitization by handling challenging table images, though the advance is incremental because it builds on existing methods such as CenterNet.

This paper tackles table structure parsing from images with severe deformations, occlusions, or bending, proposing Cycle-CenterNet with a cycle-pairing module and a new dataset WTW, achieving a 24.6% absolute improvement in accuracy on WTW using the TEDS metric.

This paper tackles the problem of table structure parsing (TSP) from images in the wild. In contrast to existing studies, which mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending, or occlusions. To design such a system, we propose an approach named Cycle-CenterNet, built on top of CenterNet, with a novel cycle-pairing module that simultaneously detects tabular cells and groups them into structured tables. For the cycle-pairing module, we propose a new pairing loss function for network training. Alongside Cycle-CenterNet, we also present a large-scale dataset named Wired Table in the Wild (WTW), which includes well-annotated structure parsing of tables in multiple styles across several scenes such as photos, scanned files, web pages, etc. In experiments, we demonstrate that Cycle-CenterNet consistently achieves the best table structure parsing accuracy on the new WTW dataset, with a 24.6% absolute improvement as measured by the TEDS metric. A more comprehensive experimental analysis also validates the advantages of our proposed methods for the TSP task.

Code Implementations: 3 repos
