Question Generation from SQL Queries Improves Neural Semantic Parsing
This work addresses data efficiency in semantic parsing, a concern for researchers and practitioners in natural language processing. The contribution is incremental: it builds on existing methods and applies them in a novel way.
The paper tackles the problem of reducing the amount of supervised training data needed for state-of-the-art semantic parsing by generating questions from SQL queries. With question generation, a state-of-the-art neural parser is learned from 30% of the supervised training data, and applying the same technique to the full data improves the model further. The paper also observes a logarithmic relationship between parser accuracy and the amount of training data.
We study how to learn a semantic parser of state-of-the-art accuracy with less supervised training data. We conduct our study on WikiSQL, the largest hand-annotated semantic parsing dataset to date. First, we demonstrate that question generation is an effective method that allows us to learn a state-of-the-art neural-network-based semantic parser with thirty percent of the supervised training data. Second, we show that applying question generation to the full supervised training data further improves the state-of-the-art model. In addition, we observe that there is a logarithmic relationship between the accuracy of a semantic parser and the amount of training data.
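To make the logarithmic relationship concrete, the following is a minimal sketch of how one might fit a curve of the form accuracy ≈ a + b · ln(n) to (training-set size, accuracy) pairs. The function names `fit_log_curve` and `predict` are hypothetical, the fitting procedure (ordinary least squares on ln(n)) is an assumption rather than the method used in the paper, and the data points are synthetic placeholders generated from a known curve, not results from the paper.

```python
import math


def fit_log_curve(sizes, accuracies):
    """Ordinary least-squares fit of accuracy ~ a + b * ln(size); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(accuracies) / count
    # Slope is the covariance of (ln size, accuracy) over the variance of ln size.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, size):
    """Accuracy predicted by the fitted curve for a given training-set size."""
    return a + b * math.log(size)


# Synthetic placeholder points drawn from a known log curve (NOT paper results).
sizes = [1000, 2000, 4000, 8000, 16000]
accuracies = [0.40 + 0.05 * math.log(n / 1000) for n in sizes]

a, b = fit_log_curve(sizes, accuracies)

# Under a log curve, doubling the data adds a constant b * ln(2) to accuracy,
# regardless of the starting size.
gain_per_doubling = b * math.log(2)
```

Under such a fit, each doubling of the training data yields the same absolute accuracy gain, which is one way to read the diminishing returns that a logarithmic relationship implies.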