CV | AI | Jun 19, 2021

TNCR: Table Net Detection and Classification Dataset

arXiv:2106.15322v1 | 41 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This paper provides a new dataset for researchers working on table detection and classification in document images. The contribution is incremental, however, because it builds on existing methods.

The authors introduced TNCR, a dataset of 9428 labeled images for table detection and classification in scanned documents, and established baselines using deep learning methods, with Cascade Mask R-CNN achieving an F1 score of 84.4%.

We present TNCR, a new table dataset with varying image quality collected from free websites. The TNCR dataset can be used to detect tables in scanned document images and to classify them into 5 different classes. TNCR contains 9428 high-quality labeled images. In this paper, we implement state-of-the-art deep learning-based methods for table detection to create several strong baselines. Cascade Mask R-CNN with a ResNeXt-101-64x4d backbone network achieves the highest performance among the compared methods, with a precision of 79.7%, a recall of 89.8%, and an F1 score of 84.4% on the TNCR dataset. We have made TNCR open source in the hope of encouraging more deep learning approaches to table detection, classification, and structure recognition. The dataset and trained model checkpoints are available at https://github.com/abdoelsayed2016/TNCR_Dataset.
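
The three reported metrics can be checked against one another, since the F1 score is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that check and is not taken from the paper or its repository. The helper name `f1_score` is a hypothetical choice made here for illustration, and the inputs are the rounded percentages quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Hypothetical helper for illustration; not part of the TNCR code base.
    """
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract for Cascade Mask R-CNN
# with a ResNeXt-101-64x4d backbone.
precision, recall = 0.797, 0.898
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.844
```

The result agrees with the reported 84.4%, so the three figures are mutually consistent at the stated precision.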

Code Implementations: 1 repo