Data Recombination for Neural Semantic Parsing
This paper addresses the challenge of modeling logical regularities in semantic parsing for natural language processing applications. It is an incremental improvement, with accuracy gains reported on three semantic parsing datasets.
The paper tackles the lack of task-specific prior knowledge in neural semantic parsers by introducing data recombination, a framework that injects structural knowledge by training a sequence-to-sequence model with an attention-based copying mechanism on examples sampled from a grammar induced from the training data. It achieves new state-of-the-art performance on the GeoQuery dataset among models with comparable supervision.
Modeling crisp logical regularities is crucial in semantic parsing, making it difficult for neural models with no task-specific prior knowledge to achieve good results. In this paper, we introduce data recombination, a novel framework for injecting such prior knowledge into a model. From the training data, we induce a high-precision synchronous context-free grammar, which captures important conditional independence properties commonly found in semantic parsing. We then train a sequence-to-sequence recurrent network (RNN) model with a novel attention-based copying mechanism on datapoints sampled from this grammar, thereby teaching the model about these structural properties. Data recombination improves the accuracy of our RNN model on three semantic parsing datasets, leading to new state-of-the-art performance on the standard GeoQuery dataset for models with comparable supervision.
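To make the idea of data recombination concrete, the following is a minimal sketch, not the paper's implementation. It assumes one simple induction strategy, abstracting known entities into typed slots, to obtain synchronous rules that pair an utterance template with a logical-form template. The toy training pairs, the LEXICON table, and the function names induce_grammar and sample are all hypothetical and introduced only for illustration.

```python
import random

# Toy training pairs (hypothetical, for illustration only).
TRAIN = [
    ("what states border texas", "answer(state(borders(stateid('texas'))))"),
    ("what is the capital of ohio", "answer(capital(stateid('ohio')))"),
]

# Hypothetical lexicon: entity mention -> (type, logical-form fragment).
LEXICON = {
    "texas": ("STATE", "stateid('texas')"),
    "ohio": ("STATE", "stateid('ohio')"),
}


def induce_grammar(pairs, lexicon):
    """Abstract each known entity into a typed slot.

    Returns synchronous template rules (utterance, logical form) and,
    per entity type, the set of (mention, fragment) rules that fill a slot.
    """
    templates = []
    entities = {}
    for utt, lf in pairs:
        for mention, (etype, frag) in lexicon.items():
            if mention in utt.split() and frag in lf:
                slot = "{" + etype + "}"
                templates.append((utt.replace(mention, slot), lf.replace(frag, slot)))
                entities.setdefault(etype, set()).add((mention, frag))
    return templates, entities


def sample(templates, entities, rng):
    """Draw one recombinant (utterance, logical form) pair from the grammar."""
    utt, lf = rng.choice(templates)
    for etype, rules in entities.items():
        slot = "{" + etype + "}"
        if slot in utt:
            # Fill both sides of the rule with the same entity so they stay aligned.
            mention, frag = rng.choice(sorted(rules))
            utt = utt.replace(slot, mention)
            lf = lf.replace(slot, frag)
    return utt, lf
```

Under these assumptions, sampling can produce pairs such as "what states border ohio" with its matching logical form, which never appears in the toy training set. In the framework described above, such sampled pairs would be used alongside the original data when training the sequence-to-sequence model.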