Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing
This work addresses the annotation bottleneck in semantic parsing and offers researchers and practitioners a way to reduce human labeling effort, though it appears incremental because it builds on existing two-stage and unsupervised approaches.
The paper tackles annotation scarcity in semantic parsing with a two-stage framework: an unsupervised paraphrase model first converts an unlabeled natural language utterance into a canonical utterance, and a naive semantic parser then maps that canonical utterance to a logical form. The framework is reported to be effective on the Overnight and GeoGranno benchmarks.
One daunting problem for semantic parsing is the scarcity of annotation. Aiming to reduce nontrivial human labor, we propose a two-stage semantic parsing framework, where the first stage utilizes an unsupervised paraphrase model to convert an unlabeled natural language utterance into a canonical utterance. The downstream naive semantic parser accepts this intermediate output and returns the target logical form. Furthermore, the entire training process is split into two phases: pre-training and cycle learning. Three tailored self-supervised tasks are introduced throughout training to activate the unsupervised paraphrase model. Experimental results on the Overnight and GeoGranno benchmarks demonstrate that our framework is effective and compatible with supervised training.
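The two-stage data flow described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the learned unsupervised paraphrase model is replaced by a toy nearest-match heuristic based on token overlap, the naive semantic parser is replaced by a lookup table, and the example utterances, logical forms, and all function names (`toy_paraphrase`, `naive_parse`, `two_stage_parse`) are hypothetical. The sketch only shows how an utterance passes through a canonical utterance on its way to a logical form.

```python
from typing import Callable, Dict

# Hypothetical toy data: canonical utterances paired with logical forms.
# These strings are invented for illustration, not taken from any benchmark.
CANONICAL_TO_LF: Dict[str, str] = {
    "article whose publication date is 2004": "(filter article publication_date = 2004)",
    "number of article": "(count article)",
}


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two whitespace-tokenized strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def toy_paraphrase(utterance: str) -> str:
    """Stage 1 stand-in (assumption): return the canonical utterance with the
    highest token overlap. The paper uses a learned paraphrase model here."""
    return max(CANONICAL_TO_LF, key=lambda canonical: jaccard(utterance, canonical))


def naive_parse(canonical: str) -> str:
    """Stage 2 stand-in (assumption): map a canonical utterance to its logical
    form by table lookup instead of a trained parser."""
    return CANONICAL_TO_LF[canonical]


def two_stage_parse(
    utterance: str,
    paraphrase: Callable[[str], str] = toy_paraphrase,
    parse: Callable[[str], str] = naive_parse,
) -> str:
    """Chain the two stages: natural language -> canonical utterance -> logical form."""
    canonical = paraphrase(utterance)
    return parse(canonical)


if __name__ == "__main__":
    print(two_stage_parse("articles whose publication date is 2004"))
    print(two_stage_parse("what is the number of article"))
```

Because both stages are passed in as callables, either stand-in can be swapped for a trained model without changing `two_stage_parse`; only the second stage ever needs paired logical forms in this sketch.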