CL · Sep 9, 2019

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

arXiv:1909.03598v1 · 1008 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of named entity recognition for languages with limited training data, offering incremental insights into transfer mechanisms.

The paper studies cross-lingual transfer as a way to build named entity recognition models for low-resource languages. It proposes a simple neural architecture that performs competitively with the state of the art, then analyzes two transferable factors: sequential order and multilingual embeddings.

Building named entity recognition (NER) models for languages that do not have much training data is a challenging task. While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred. In this paper, we first propose a simple and efficient neural architecture for cross-lingual NER. Experiments show that our model achieves performance competitive with the state of the art. We further analyze how transfer learning works for cross-lingual NER on two transferable factors, sequential order and multilingual embeddings, and investigate how model performance varies across entity lengths. Finally, we conduct a case study on a non-Latin language, Bengali, which suggests that leveraging knowledge from Wikipedia is a promising direction for further improving model performance. Our results can shed light on future research for improving cross-lingual NER.

