NE · Apr 3, 2019

Extracting Tables from Documents using Conditional Generative Adversarial Networks and Genetic Algorithms

arXiv:1904.01947v1 · 12 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of table extraction in industry and academic research, presenting an incremental improvement that incorporates prior information about table structure.

The paper tackles table extraction from documents with a top-down method. A generative adversarial network first generates a skeleton table structure, and a genetic algorithm then optimizes candidate table structures against that skeleton. The method is reported to achieve improved accuracy over bottom-up approaches.

Extracting information from tables in documents presents a significant challenge in many industries and in academic research. Existing methods which take a bottom-up approach of integrating lines into cells and rows or columns neglect the available prior information relating to table structure. Our proposed method takes a top-down approach, first using a generative adversarial network to map a table image into a standardised 'skeleton' table form denoting the approximate row and column borders without table content, then fitting renderings of candidate latent table structures to the skeleton structure using a distance measure optimised by a genetic algorithm.
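The second stage described in the abstract, fitting rendered candidate structures to a skeleton with a genetic algorithm, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the skeleton is taken as an already available binary image (the GAN stage is skipped), a candidate is encoded as a flat vector of row and column border positions, the distance is a plain pixel mismatch count, and the names `render`, `distance`, and `fit_structure` are hypothetical.

```python
import numpy as np

def render(rows, cols, shape):
    """Draw a skeleton image: 1-pixel lines at the given row and column borders."""
    img = np.zeros(shape, dtype=np.uint8)
    img[rows, :] = 1
    img[:, cols] = 1
    return img

def distance(candidate, skeleton):
    """Assumed distance measure: number of pixels where the two images disagree."""
    return int(np.sum(candidate != skeleton))

def fit_structure(skeleton, n_rows, n_cols, pop_size=60, generations=200, seed=0):
    """Toy genetic algorithm over border positions (illustrative only)."""
    rng = np.random.default_rng(seed)
    h, w = skeleton.shape
    n_genes = n_rows + n_cols
    # Genome: n_rows row borders followed by n_cols column borders.
    upper = np.array([h] * n_rows + [w] * n_cols)
    pop = (rng.random((pop_size, n_genes)) * upper).astype(int)

    def cost(ind):
        return distance(render(ind[:n_rows], ind[n_rows:], skeleton.shape), skeleton)

    n_elite = pop_size // 4
    for _ in range(generations):
        costs = np.array([cost(ind) for ind in pop])
        elite = pop[np.argsort(costs)[:n_elite]]      # selection: keep the best quarter
        n_children = pop_size - n_elite
        pa = elite[rng.integers(0, n_elite, n_children)]
        pb = elite[rng.integers(0, n_elite, n_children)]
        # Uniform crossover: each gene comes from one of the two parents.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Mutation: shift roughly 20% of genes by up to 2 pixels.
        shift = rng.integers(-2, 3, size=children.shape)
        mutate = rng.random(children.shape) < 0.2
        children = np.clip(children + shift * mutate, 0, upper - 1)
        pop = np.vstack([elite, children])

    costs = np.array([cost(ind) for ind in pop])
    best = pop[np.argmin(costs)]
    return sorted(best[:n_rows].tolist()), sorted(best[n_rows:].tolist()), int(costs.min())

# Usage: build a synthetic skeleton, then try to recover its borders.
target = render(np.array([2, 10, 17]), np.array([3, 12, 20, 27]), (20, 30))
rows, cols, cost = fit_structure(target, n_rows=3, n_cols=4)
print(rows, cols, cost)
```

This toy search can stall in a local optimum, for instance when two genes settle on the same border, so a cost of zero is not guaranteed. A practical system would likely need a smoother distance measure and a richer genome that also encodes the number of rows and columns.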

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
