CLMar 19, 2018

Polyglot Semantic Parsing in APIs

Kyle Richardson, Jonathan Berant, Jonas Kuhn

arXiv:1803.06966v232.01096 citationsHas Code

Originality Incremental advance

AI Analysis

This enables parsing diverse natural and programming languages with a single model, addressing a domain-specific need for more versatile semantic parsers.

The paper tackles the problem of training semantic parsing models on multiple datasets and languages, focusing on translating text to code signatures, and achieves state-of-the-art performance on software component datasets.

Traditional approaches to semantic parsing (SP) work by training individual models for each available parallel dataset of text-meaning pairs. In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. In particular, we focus on translating text to code signature representations using the software component datasets of Richardson and Kuhn (2017a,b). The advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and apply this method to two other benchmark SP tasks.

View on arXiv PDF Code

Similar