CL · Sep 1, 2025

Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective

Zhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou

arXiv:2509.01147v1 · 4.91 citations · h-index: 5 · EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of transferring NER knowledge to low-resource non-Latin script languages, an incremental advance over existing methods that focus on Latin script languages.

The paper tackles zero-shot cross-lingual named entity recognition (CL-NER) for non-Latin script languages such as Chinese and Japanese, where performance degrades because of deep structural differences. It proposes an entity-aligned translation (EAT) approach that uses large language models to align entities between languages, and it fine-tunes those models on multilingual Wikipedia data to improve the alignment.

Cross-lingual Named Entity Recognition (CL-NER) aims to transfer knowledge from high-resource languages to low-resource languages. However, existing zero-shot CL-NER (ZCL-NER) approaches primarily focus on Latin script languages (LSL), where shared linguistic features facilitate effective knowledge transfer. In contrast, for non-Latin script languages (NSL), such as Chinese and Japanese, performance often degrades due to deep structural differences. To address these challenges, we propose an entity-aligned translation (EAT) approach. Leveraging large language models (LLMs), EAT employs a dual-translation strategy to align entities between NSL and English. In addition, we fine-tune LLMs using multilingual Wikipedia data to enhance the entity alignment from source to target languages.

