CL | AI | DB | Feb 19, 2025

Towards Question Answering over Large Semi-structured Tables

arXiv:2502.13422v21 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses accurate question answering for users who work with large web tables. The advance is incremental, as it builds on existing decomposition methods.

The paper tackles the challenge of Table Question Answering (TableQA) over large semi-structured tables by proposing TaDRe, a model that uses pre- and post-table decomposition refinements to improve decomposition quality, achieving state-of-the-art performance on new and public benchmarks.

Table Question Answering (TableQA) attracts strong interest due to the prevalence of web information presented in the form of semi-structured tables. Despite many efforts, TableQA over large tables remains an open challenge. This is because large tables may overwhelm models that try to comprehend them in full to locate question answers. Recent studies reduce input table size by decomposing tables into smaller, question-relevant sub-tables by generating programs that parse the tables. However, such solutions are subject to program generation and execution errors, and it is difficult to ensure their decomposition quality. To address this issue, we propose TaDRe, a TableQA model that incorporates both pre- and post-table decomposition refinements to ensure table decomposition quality, hence achieving highly accurate TableQA results. To evaluate TaDRe, we construct two new large-table TableQA benchmarks via LLM-driven table expansion and QA pair generation. Extensive experiments on both the new and public benchmarks show that TaDRe achieves state-of-the-art performance on large-table TableQA tasks.
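
The general idea of cutting a large table down to a question-relevant sub-table, then checking the result, can be illustrated with a toy sketch. This is not TaDRe's method: the abstract does not specify the refinement steps, so the keyword-overlap selection, the fallback rule, and the names `tokenize`, `decompose`, and `refine` are all assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Lowercase a value and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", str(text).lower()))


def decompose(header, rows, question):
    """Hypothetical decomposition: pick rows and columns that overlap the question.

    A row is kept if any of its cells shares a token with the question.
    A column is kept if its header or any of its cells shares a token.
    """
    q = tokenize(question)
    row_idx = [r for r, row in enumerate(rows)
               if any(tokenize(cell) & q for cell in row)]
    col_idx = [c for c, name in enumerate(header)
               if tokenize(name) & q
               or any(tokenize(row[c]) & q for row in rows)]
    return col_idx, row_idx


def refine(header, rows, col_idx, row_idx):
    """Hypothetical post-check: an empty selection falls back to the full axis."""
    if not col_idx:
        col_idx = list(range(len(header)))
    if not row_idx:
        row_idx = list(range(len(rows)))
    sub_header = [header[c] for c in col_idx]
    sub_rows = [[rows[r][c] for c in col_idx] for r in row_idx]
    return sub_header, sub_rows


header = ["city", "country", "population"]
rows = [
    ["Paris", "France", "2100000"],
    ["Lyon", "France", "520000"],
    ["Berlin", "Germany", "3600000"],
]
question = "What is the population of Berlin?"

cols, selected_rows = decompose(header, rows, question)
sub_header, sub_rows = refine(header, rows, cols, selected_rows)
```

In this sketch the fallback in `refine` stands in for the kind of quality check the abstract motivates: if the selection step fails and returns nothing, the downstream reader still receives the full table rather than an empty one.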

