CL AI LG Sep 9, 2021

Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data

Massimo Nicosia, Zhongdi Qu, Yasemin Altun

arXiv:2109.04319v1 30.8662 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of limited target language supervision for multilingual semantic parsing, offering an incremental improvement over existing data augmentation techniques.

The paper tackles the performance gap in zero-shot multilingual semantic parsing by proposing a Translate-and-Fill method to generate synthetic training data, achieving competitive accuracies on three datasets compared to traditional alignment-based systems.

While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, there is still a wide performance gap in semantic parsing tasks compared with settings where target language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parser. This method simplifies the popular Translate-Align-Project (TAP) pipeline and consists of a sequence-to-sequence filler model that constructs a full parse conditioned on an utterance and a view of the same parse. Our filler is trained on English data only but can accurately complete instances in other languages (i.e., translations of the English training utterances), in a zero-shot fashion. Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems which rely on traditional alignment techniques.
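To make the filler's input concrete, here is a minimal sketch of how an utterance might be paired with a "view" of its parse. The abstract does not specify the view format, so everything below is an assumption for illustration: the TOP-style bracket notation, the `<fill>` placeholder, the `|||` separator, and the helper names `make_view` and `filler_input` are all hypothetical. The sequence-to-sequence filler itself is not implemented; only the construction of its input is shown.

```python
import re

# Hypothetical placeholder token; the paper's actual view format may differ.
PLACEHOLDER = "<fill>"


def make_view(parse: str) -> str:
    """Build a view of a bracketed parse by masking flat slot values.

    Assumes a TOP-style notation such as "[SL:NAME value tokens ]".
    Slots whose content contains nested brackets are left untouched.
    """
    pattern = r"(\[SL:\w+)\s+[^\[\]]+?\s*\]"
    return re.sub(pattern, rf"\1 {PLACEHOLDER} ]", parse)


def filler_input(utterance: str, view: str, sep: str = " ||| ") -> str:
    """Concatenate an utterance and a parse view into one input string."""
    return f"{utterance}{sep}{view}"


# English training parse (illustrative, not taken from the paper's datasets).
english_parse = "[IN:GET_WEATHER [SL:LOCATION Boston ] [SL:DATE_TIME tomorrow ] ]"
view = make_view(english_parse)

# At augmentation time the utterance would be a translation of the English one.
translated = "¿Qué tiempo hará mañana en Boston?"
example = filler_input(translated, view)
print(example)
```

Under these assumptions, training would pair an English utterance and view with the full English parse as the target. At augmentation time the translated utterance replaces the English one, and the filler's output would serve as the silver parse for the target language.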
