TableZa -- A classical Computer Vision approach to Tabular Extraction
This work addresses the error-prone task of tabular data extraction for document processing applications, but it appears incremental because it builds on existing computer vision methods.
The paper tackles the challenging problem of extracting tabular data from images or vector PDFs by proposing a novel computer vision approach, aiming to improve accuracy in document comprehension tasks.
Computer-aided tabular data extraction has always been a challenging and error-prone task because it demands both spectral and spatial sanity of the data. In this paper we discuss an approach to tabular data extraction in the realm of document comprehension. Given the variety of tabular formats found across documents, we present a novel computer vision approach for extracting tabular data from images or from vector PDFs converted to images.