CV, LG | Jun 8, 2025

TABLET: Table Structure Recognition using Encoder-only Transformers

arXiv:2506.07015v1 | 8.45 citations | h-index: 2 | ICDAR

Originality: Highly original

AI Analysis

The proposed method offers a robust and scalable approach to large-scale table recognition, and is particularly suited to industrial applications that process densely populated tables.

The paper tackles the problem of table structure recognition by proposing a Split-Merge-based top-down model that uses encoder-only Transformers for row/column splitting and merging, achieving high accuracy and fast processing speed on datasets like FinTabNet and PubTabNet.

To address the challenges of table structure recognition, we propose a novel Split-Merge-based top-down model optimized for large, densely populated tables. Our approach formulates row and column splitting as sequence labeling tasks, utilizing dual Transformer encoders to capture feature interactions. The merging process is framed as a grid cell classification task, leveraging an additional Transformer encoder to ensure accurate and coherent merging. By eliminating unstable bounding box predictions, our method reduces resolution loss and computational complexity, achieving high accuracy while maintaining fast processing speed. Extensive experiments on FinTabNet and PubTabNet demonstrate the superiority of our model over existing approaches, particularly in real-world applications. Our method offers a robust, scalable, and efficient solution for large-scale table recognition, making it well-suited for industrial deployment.
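The split-then-merge idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the row and column encoders have already produced binary per-pixel labels (1 = separator, 0 = content) and that the merge encoder has produced one label per grid cell. The label scheme used here ("C" = new cell, "L" = merge with left neighbor, "U" = merge with upper neighbor) and the function names are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def runs_to_boundaries(labels: List[int]) -> List[int]:
    """Collapse each run of 1s (separator pixels) into its midpoint index."""
    boundaries = []
    start = None
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i  # a separator run begins
        elif value == 0 and start is not None:
            boundaries.append((start + i - 1) // 2)  # run ended at i - 1
            start = None
    if start is not None:  # run reaches the end of the sequence
        boundaries.append((start + len(labels) - 1) // 2)
    return boundaries


def build_grid(row_labels: List[int], col_labels: List[int]) -> List[List[Box]]:
    """Turn per-pixel split labels into a grid of cell boxes."""
    ys = [0] + runs_to_boundaries(row_labels) + [len(row_labels)]
    xs = [0] + runs_to_boundaries(col_labels) + [len(col_labels)]
    return [
        [(xs[c], ys[r], xs[c + 1], ys[r + 1]) for c in range(len(xs) - 1)]
        for r in range(len(ys) - 1)
    ]


def merge_cells(merge_labels: List[List[str]]) -> List[List[int]]:
    """Assign a cell id to every grid position (hypothetical C/L/U scheme).

    Grid positions sharing an id belong to the same spanning cell.
    """
    ids: List[List[int]] = []
    next_id = 0
    for r, row in enumerate(merge_labels):
        id_row: List[int] = []
        for c, label in enumerate(row):
            if label == "L" and c > 0:
                id_row.append(id_row[c - 1])  # extend the cell to the left
            elif label == "U" and r > 0:
                id_row.append(ids[r - 1][c])  # extend the cell above
            else:
                id_row.append(next_id)  # start a new cell
                next_id += 1
        ids.append(id_row)
    return ids
```

For example, `build_grid([0, 0, 0, 1, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0, 1, 1, 1, 0])` yields a grid of 2 rows by 3 columns, and `merge_cells([["C", "L", "C"], ["C", "C", "U"]])` marks the first two cells of the top row as one column-spanning cell and the last column as one row-spanning cell. Because the output is a grid plus merge decisions, no per-cell bounding box is regressed, which matches the abstract's point about avoiding unstable bounding box predictions.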
