CL · AI · Mar 12

Just Use XML: Revisiting Joint Translation and Label Projection

arXiv:2603.12021v1 · 23.8 · h-index: 5
Predicted impact: top 63% in CL · last 90 days
Originality: Incremental advance
AI Analysis

The paper addresses cross-lingual transfer for low-resource languages by enabling efficient label projection without compromising translation quality. The advance is incremental, since it builds on existing label projection techniques.

The paper tackles the reported degradation of translation quality in joint translation and label projection by introducing LabelPigeon, a framework that performs both steps via XML tags. LabelPigeon outperforms baselines and improves translation quality in 11 languages, with cross-lingual transfer gains of up to +39.9 F1 on NER.

Label projection is an effective technique for cross-lingual transfer, extending span-annotated datasets from a high-resource language to low-resource ones. Most approaches perform label projection as a separate step after machine translation, and prior work that combines the two reports degraded translation quality. We re-evaluate this claim with LabelPigeon, a novel framework that jointly performs translation and label projection via XML tags. We design a direct evaluation scheme for label projection, and find that LabelPigeon outperforms baselines and actively improves translation quality in 11 languages. We further assess translation quality across 203 languages and varying annotation complexity, finding consistent improvement attributed to additional fine-tuning. Finally, across 27 languages and three downstream tasks, we report substantial gains in cross-lingual transfer over comparable work, up to +39.9 F1 on NER. Overall, our results demonstrate that XML-tagged label projection provides effective and efficient label transfer without compromising translation quality.
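
To make the idea of XML-tagged label projection concrete, here is a minimal sketch of the round trip: wrap annotated spans in XML tags, translate the tagged sentence, then strip the tags and read the span offsets back out of the translation. This is not LabelPigeon's implementation. The helper names `tag_spans` and `extract_spans`, the tag names, and the hard-coded German "translation" are all illustrative assumptions, and the sketch handles only flat, non-overlapping spans in text without literal angle brackets.

```python
import re
from typing import List, Tuple

# (start, end, label) with end exclusive; hypothetical span format for this sketch.
Span = Tuple[int, int, str]

# Matches a flat <LABEL>...</LABEL> pair; nested tags are not supported here.
TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def tag_spans(text: str, spans: List[Span]) -> str:
    """Wrap each non-overlapping span in an XML tag named after its label."""
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])
        out.append(f"<{label}>{text[start:end]}</{label}>")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def extract_spans(tagged: str) -> Tuple[str, List[Span]]:
    """Strip the tags and return the clean text plus span offsets within it."""
    clean, spans, cursor, length = [], [], 0, 0
    for m in TAG.finditer(tagged):
        before = tagged[cursor:m.start()]
        clean.append(before)
        length += len(before)
        inner = m.group(2)
        spans.append((length, length + len(inner), m.group(1)))
        clean.append(inner)
        length += len(inner)
        cursor = m.end()
    clean.append(tagged[cursor:])
    return "".join(clean), spans


source = "Angela Merkel visited Paris."
tagged = tag_spans(source, [(0, 13, "PER"), (22, 27, "LOC")])

# Stand-in for the output of a translation model that preserves the tags.
# Hard-coded here; a real system would call an MT model at this step.
translated = "<PER>Angela Merkel</PER> besuchte <LOC>Paris</LOC>."

target_text, target_spans = extract_spans(translated)
```

Because the tags travel through translation with the text, the projected spans come from a single parsing pass over the output, with no separate alignment step.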

Code Implementations: 1 repo