CL · Nov 10, 2020

Multilingual AMR-to-Text Generation

arXiv:2011.05443v131.1995 citations · h-index: 28

Originality: Incremental advance

AI Analysis

This work addresses multilingual text generation from structured data. It is an incremental advance that extends existing AMR-to-text methods, which have overwhelmingly targeted English, to multiple languages.

The paper tackles generating text from Abstract Meaning Representations (AMRs) in multiple languages, which requires handling varied word order and morphology. On automatic metrics, the multilingual models surpass single-language baselines for eighteen of the twenty-one languages, and in a human evaluation native speakers judge the generations to be fluent.

Generating text from structured data is challenging because it requires bridging the gap between (i) structure and natural language (NL) and (ii) semantically underspecified input and fully specified NL output. Multilingual generation brings in an additional challenge: that of generating into languages with varied word order and morphological properties. In this work, we focus on Abstract Meaning Representations (AMRs) as structured input, where previous research has overwhelmingly focused on generating only into English. We leverage advances in cross-lingual embeddings, pretraining, and multilingual models to create multilingual AMR-to-text models that generate in twenty-one different languages. For eighteen languages, based on automatic metrics, our multilingual models surpass baselines that generate into a single language. We analyse the ability of our multilingual models to accurately capture morphology and word order using human evaluation, and find that native speakers judge our generations to be fluent.
