Cross-domain Semantic Parsing via Paraphrasing
The work addresses the problem of adapting semantic parsers to new domains, which is relevant to NLP researchers and practitioners, though it is incremental in that it builds on existing domain adaptation and paraphrasing methods.
The paper tackles cross-domain semantic parsing by formulating it as a domain adaptation problem and reducing it to paraphrasing using an attentive sequence-to-sequence model, achieving significant improvement on the Overnight dataset with eight domains.
Existing studies on semantic parsing mainly focus on the in-domain setting. We formulate cross-domain semantic parsing as a domain adaptation problem: train a semantic parser on some source domains and then adapt it to the target domain. Due to the diversity of logical forms in different domains, this problem presents unique and intriguing challenges. By converting logical forms into canonical utterances in natural language, we reduce semantic parsing to paraphrasing, and develop an attentive sequence-to-sequence paraphrase model that is general and flexible to adapt to different domains. We discover two problems, small micro variance and large macro variance, of pre-trained word embeddings that hinder their direct use in neural networks, and propose standardization techniques as a remedy. On the popular Overnight dataset, which contains eight domains, we show that both cross-domain training and standardized pre-trained word embeddings can bring significant improvement.
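The abstract does not spell out the standardization techniques, so the following is only a minimal sketch of one plausible reading. It assumes that "micro variance" refers to the variance of the values within a single word vector, that "macro variance" refers to how much vector magnitudes differ across words, and that the remedy is to rescale each vector to zero mean and unit variance. The function names and the toy embedding values are hypothetical and not taken from the paper.

```python
import math
import statistics


def standardize_per_example(vec):
    """Rescale one word vector to zero mean and unit variance across its dimensions.

    Assumption: this per-vector standardization is one possible form of the
    "standardization techniques" mentioned in the abstract.
    """
    mean = statistics.fmean(vec)
    std = statistics.pstdev(vec)  # population standard deviation
    if std == 0.0:
        # A constant vector carries no direction after centering; map it to zeros.
        return [0.0 for _ in vec]
    return [(x - mean) / std for x in vec]


def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


# Toy "pre-trained" embeddings with made-up values: one vector has tiny
# entries (small variance within the vector), the other is much larger in
# magnitude (large spread of norms across words).
embeddings = {
    "article": [0.02, -0.01, 0.03, 0.00],
    "publication": [0.90, -1.10, 1.40, -0.20],
}

standardized = {w: standardize_per_example(v) for w, v in embeddings.items()}

for word, vec in standardized.items():
    print(word, round(statistics.pvariance(vec), 6), round(l2_norm(vec), 6))
```

Under these assumptions, every standardized vector has unit variance across its dimensions and the same L2 norm (the square root of the dimensionality), so both kinds of variance are addressed by a single rescaling step. Whether this matches the exact procedure used in the paper should be checked against the full text.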