CLDec 4, 2021

Hierarchical Neural Data Synthesis for Semantic Parsing

arXiv:2112.02212v115 citations
Originality Incremental advance
AI Analysis

This addresses data scarcity for semantic parsing researchers and practitioners, offering a scalable solution without requiring linguistic expertise, though it is incremental as it builds on existing neural methods.

The paper tackles the problem of expensive data collection and limited scalability in semantic parsing by proposing a neural data augmentation method that eliminates grammar engineering, achieving state-of-the-art performance with 77.2% accuracy on the Spider cross-domain text-to-SQL benchmark in a zero-shot setting.

Semantic parsing datasets are expensive to collect. Moreover, even the questions pertinent to a given domain, which are the input of a semantic parsing system, might not be readily available, especially in cross-domain semantic parsing. This makes data augmentation even more challenging. Existing methods to synthesize new data use hand-crafted or induced rules, requiring substantial engineering effort and linguistic expertise to achieve good coverage and precision, which limits the scalability. In this work, we propose a purely neural approach of data augmentation for semantic parsing that completely removes the need for grammar engineering while achieving higher semantic parsing accuracy. Furthermore, our method can synthesize in the zero-shot setting, where only a new domain schema is available without any input-output examples of the new domain. On the Spider cross-domain text-to-SQL semantic parsing benchmark, we achieve the state-of-the-art performance on the development set (77.2% accuracy) using our zero-shot augmentation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes