LG | AI | Mar 17, 2023

Generate, Transform, Answer: Question Specific Tool Synthesis for Tabular Data

arXiv:2303.10138v1 | 14 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in tabular question answering (TQA) over large semi-structured tables, offering an incremental improvement by combining programmatic tools with neural components.

The paper targets the information loss that occurs in TQA when language models process large tables directly. It proposes ToolWriter, which generates query-specific row-filtering programs that transform a table before the TQA model processes it. The approach improves on the state of the art on the WikiTableQuestions and WikiSQL benchmarks, with the largest gains on long tables.

Tabular question answering (TQA) presents a challenging setting for neural systems by requiring joint reasoning of natural language with large amounts of semi-structured data. Unlike humans who use programmatic tools like filters to transform data before processing, language models in TQA process tables directly, resulting in information loss as table size increases. In this paper we propose ToolWriter to generate query specific programs and detect when to apply them to transform tables and align them with the TQA model's capabilities. Focusing ToolWriter to generate row-filtering tools improves the state-of-the-art for WikiTableQuestions and WikiSQL with the most performance gained on long tables. By investigating headroom, our work highlights the broader potential for programmatic tools combined with neural components to manipulate large amounts of structured data.
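
To make the row-filtering idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the names `apply_row_filter` and `generated_filter` are hypothetical, the filter is written by hand as a stand-in for a program that a model would generate, and the fallback rule (keep the full table if the filter fails or removes every row) is an assumed heuristic for deciding when to apply a tool, not the mechanism described in the paper.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Table = List[Row]


def apply_row_filter(table: Table, row_filter: Callable[[Row], bool]) -> Table:
    """Keep only the rows the filter accepts.

    Assumed fallback: if the filter raises or rejects every row,
    return the original table unchanged so no information is lost.
    """
    try:
        kept = [row for row in table if row_filter(row)]
    except Exception:
        return table
    return kept if kept else table


# Toy table for illustration only.
table: Table = [
    {"Year": "1996", "City": "Atlanta", "Country": "USA"},
    {"Year": "2000", "City": "Sydney", "Country": "Australia"},
    {"Year": "2004", "City": "Athens", "Country": "Greece"},
    {"Year": "2008", "City": "Beijing", "Country": "China"},
]

question = "Which cities hosted the games after 2000?"


def generated_filter(row: Row) -> bool:
    # Hand-written stand-in for a query-specific program.
    return int(row["Year"]) > 2000


# The shortened table is what a downstream TQA model would receive.
filtered = apply_row_filter(table, generated_filter)
print([row["City"] for row in filtered])
```

In this sketch the downstream model would see two rows instead of four. On a table with hundreds of rows, the same step is what keeps the input within the range the model handles well.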

