CL · Apr 20

Syntax as a Rosetta Stone: Universal Dependencies for In-Context Coptic Translation

arXiv:2604.18758 · 77.5 · h-index: 2
Predicted impact: top 77% in CL · last 90 days
Originality: Incremental advance
AI Analysis

For researchers working on low-resource machine translation, this work demonstrates that syntactic augmentation can complement lexical resources to improve translation quality, though the gains are incremental over existing dictionary-based methods.

This paper introduces a novel in-context learning approach for low-resource Coptic-to-English translation that augments input with syntactic information from Universal Dependencies. Combining syntactic parses with dictionary-based glosses achieves new state-of-the-art results, with significant gains across model sizes.

Low-resource machine translation requires methods that differ from those used for high-resource languages. This paper proposes a novel in-context learning approach to support low-resource machine translation of the Coptic language to English, with syntactic augmentation from Universal Dependencies parses of input sentences. Building on existing work using bilingual dictionaries to support inference for vocabulary items, we add several representations of syntactic analyses to our inputs, specifically exploring the inclusion of raw parser outputs, verbalizations of parses in plain English, and targeted instructions on difficult constructions identified in sub-trees and how they can be translated. Our results show that while syntactic information alone is not as useful as dictionary-based glosses, combining retrieved dictionary items with syntactic information yields significant gains across model sizes and sets a new state of the art for Coptic translation.
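
To make the idea of combining dictionary glosses with a verbalized parse concrete, the following is a minimal sketch and not the authors' implementation. It assumes the parse arrives in the standard ten-column CoNLL-U format; the helper names (`parse_conllu`, `verbalize`, `build_prompt`), the prompt wording, and the placeholder tokens `w1`, `w2`, `w3` with their glosses are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based token position (CoNLL-U ID column)
    form: str    # surface form (FORM column)
    upos: str    # universal POS tag (UPOS column)
    head: int    # index of the syntactic head, 0 for the root (HEAD column)
    deprel: str  # dependency relation to the head (DEPREL column)


def parse_conllu(block: str) -> list[Token]:
    """Read ID, FORM, UPOS, HEAD and DEPREL from one CoNLL-U sentence."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append(Token(int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    return tokens


def verbalize(tokens: list[Token]) -> list[str]:
    """Turn each dependency edge into one plain-English statement."""
    by_idx = {t.idx: t for t in tokens}
    statements = []
    for t in tokens:
        if t.head == 0:
            statements.append(f'"{t.form}" ({t.upos}) is the root of the sentence.')
        else:
            head = by_idx[t.head]
            statements.append(
                f'"{t.form}" ({t.upos}) attaches to "{head.form}" as {t.deprel}.'
            )
    return statements


def build_prompt(source: str, tokens: list[Token], glosses: dict[str, str]) -> str:
    """Assemble an in-context prompt from glosses plus the verbalized parse."""
    parts = [
        "Translate the following sentence into English.",
        f"Sentence: {source}",
        "Dictionary glosses:",
        *[f"- {form}: {gloss}" for form, gloss in glosses.items()],
        "Syntactic analysis:",
        *[f"- {s}" for s in verbalize(tokens)],
        "Translation:",
    ]
    return "\n".join(parts)


# Placeholder sentence (not real Coptic): three tokens in CoNLL-U column order
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ["1", "w1", "_", "NOUN", "_", "_", "2", "nsubj", "_", "_"],
        ["2", "w2", "_", "VERB", "_", "_", "0", "root", "_", "_"],
        ["3", "w3", "_", "NOUN", "_", "_", "2", "obj", "_", "_"],
    ]
)

if __name__ == "__main__":
    toks = parse_conllu(SAMPLE)
    print(build_prompt("w1 w2 w3", toks, {"w1": "GLOSS_1", "w2": "GLOSS_2"}))
```

In this sketch the glosses and the syntactic statements are kept in separate labelled sections, so either source of information can be dropped to compare a glosses-only, syntax-only, or combined prompt.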

