CL, Feb 13, 2025

Can Uniform Meaning Representation Help GPT-4 Translate from Indigenous Languages?

arXiv:2502.08900v26.72 citations, h-index: 9, ACL

Originality: Incremental advance

AI Analysis

This work addresses translation challenges for indigenous language communities, but the advance is incremental because it builds on existing UMR and GPT-4 methods.

The study addresses GPT-4's difficulty translating from indigenous languages by incorporating Uniform Meaning Representation (UMR) into prompts. It finds that UMR integration yields statistically significant performance improvements in most test cases.

While ChatGPT and GPT-based models are able to effectively perform many tasks without additional fine-tuning, they struggle with tasks related to extremely low-resource languages and indigenous languages. Uniform Meaning Representation (UMR), a semantic representation designed to capture the meaning of texts in many languages, is well-positioned to be leveraged in the development of low-resource language technologies. In this work, we explore the downstream utility of UMR for low-resource languages by incorporating it into GPT-4 prompts. Specifically, we examine the ability of GPT-4 to perform translation from three indigenous languages (Navajo, Arápaho, and Kukama), with and without demonstrations, as well as with and without UMR annotations. Ultimately, we find that in the majority of our test cases, integrating UMR into the prompt results in a statistically significant increase in performance, which is a promising indication of future applications of the UMR formalism.
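The experimental design described in the abstract crosses two factors: demonstrations (with or without) and UMR annotations (with or without), giving four prompt conditions per source sentence. The following is a minimal sketch of how such prompts might be assembled. The function names `build_prompt` and `all_conditions`, the prompt wording, and the placeholder strings are all assumptions made for illustration; they are not the templates or code used in the paper.

```python
from itertools import product


def build_prompt(source, language, umr=None, demonstrations=()):
    """Assemble one translation prompt (hypothetical wording, not the paper's)."""
    parts = [f"Translate the following {language} sentence into English."]
    # Optional few-shot demonstrations, each a dict with source/target (and umr).
    for demo in demonstrations:
        parts.append(f"{language}: {demo['source']}")
        if umr is not None and "umr" in demo:
            parts.append(f"UMR: {demo['umr']}")
        parts.append(f"English: {demo['target']}")
    # The sentence to translate, optionally paired with its UMR annotation.
    parts.append(f"{language}: {source}")
    if umr is not None:
        parts.append(f"UMR: {umr}")
    parts.append("English:")
    return "\n".join(parts)


def all_conditions(source, language, umr, demonstrations):
    """Return the four prompts keyed by (use_demonstrations, use_umr)."""
    prompts = {}
    for use_demos, use_umr in product((False, True), repeat=2):
        prompts[(use_demos, use_umr)] = build_prompt(
            source,
            language,
            umr=umr if use_umr else None,
            demonstrations=demonstrations if use_demos else (),
        )
    return prompts


# Placeholder inputs only; no real corpus data is reproduced here.
demos = [{"source": "<demo source>", "umr": "<demo UMR>", "target": "<demo translation>"}]
prompts = all_conditions("<source sentence>", "Navajo", "<UMR graph>", demos)
```

Each of the four resulting prompts would then be sent to the model and scored against a reference translation, so that the conditions with and without UMR can be compared on the same sentences.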
