CV Mar 14, 2023

Rethinking Image-based Table Recognition Using Weakly Supervised Methods

Nam Tuan Ly, Atsuhiro Takasu, Phuc Nguyen, Hideaki Takeda

arXiv:2303.07641v1 10.421 citations h-index: 20 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the high cost of annotation for table recognition, which makes the task more accessible to researchers and practitioners. The advance is incremental because it builds on existing weakly supervised approaches.

The paper tackles the high cost and subjectivity of detailed table image annotations by proposing WSTabNet, a weakly supervised model that uses only HTML/LaTeX code-level annotations, and it achieves comparable or better accuracy than state-of-the-art models on multiple benchmark datasets.

Most of the previous methods for table recognition rely on training datasets containing many richly annotated table images. Detailed table image annotation, e.g., cell or text bounding box annotation, however, is costly and often subjective. In this paper, we propose a weakly supervised model named WSTabNet for table recognition that relies only on HTML (or LaTeX) code-level annotations of table images. The proposed model consists of three main parts: an encoder for feature extraction, a structure decoder for generating table structure, and a cell decoder for predicting the content of each cell in the table. Our system is trained end-to-end by stochastic gradient descent algorithms, requiring only table images and their ground-truth HTML (or LaTeX) representations. To facilitate table recognition with deep learning, we create and release WikiTableSet, the largest publicly available image-based table recognition dataset built from Wikipedia. WikiTableSet contains nearly 4 million English table images, 590K Japanese table images, and 640K French table images with corresponding HTML representations and cell bounding boxes. Extensive experiments on WikiTableSet and two large-scale datasets, FinTabNet and PubTabNet, demonstrate that the proposed weakly supervised model achieves better or similar accuracy compared with the state-of-the-art models on all benchmark datasets.
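To make the idea of code-level supervision concrete, the sketch below shows one way an HTML annotation could be split into two training targets: a sequence of structure tokens (for a structure decoder) and a list of per-cell text strings (for a cell decoder). This is an illustration under assumptions, not the authors' preprocessing: the helper names `TableTargetSplitter` and `split_html_annotation`, the choice of structure tags, and the decision to drop inline markup are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical helper, not taken from the paper: splits an HTML table
# annotation into structure tokens and per-cell text targets.
STRUCTURE_TAGS = {"table", "thead", "tbody", "tr", "td", "th"}


class TableTargetSplitter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.structure = []  # assumed target sequence for a structure decoder
        self.cells = []      # assumed text target per cell for a cell decoder
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag not in STRUCTURE_TAGS:
            return  # inline markup such as <b> is ignored in this sketch
        if tag in ("td", "th"):
            # Keep span attributes, since they define the table layout.
            spans = "".join(
                f' {k}="{v}"' for k, v in attrs if k in ("rowspan", "colspan")
            )
            self.structure.append(f"<{tag}{spans}>")
            self.cells.append("")
            self._in_cell = True
        else:
            self.structure.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in STRUCTURE_TAGS:
            return
        if tag in ("td", "th"):
            self._in_cell = False
            self.cells[-1] = self.cells[-1].strip()
        self.structure.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is collected only while inside a cell.
        if self._in_cell:
            self.cells[-1] += data


def split_html_annotation(html):
    """Return (structure_tokens, cell_texts) for one HTML table string."""
    parser = TableTargetSplitter()
    parser.feed(html)
    parser.close()
    return parser.structure, parser.cells


if __name__ == "__main__":
    example = (
        '<table><tr><th colspan="2">Name</th></tr>'
        "<tr><td>a</td><td><b>b</b> c</td></tr></table>"
    )
    structure, cells = split_html_annotation(example)
    print(structure)
    print(cells)
```

Neither target requires cell or text bounding boxes, which is the point of the weak supervision described in the abstract: the image and its HTML string are the only inputs needed to build both sequences.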
