Neural Cross-Lingual Named Entity Recognition with Minimal Resources
The paper addresses NER for low-resource languages, which benefits NLP applications in those settings, but its contribution is incremental because it builds on existing cross-lingual transfer methods.
The paper tackles cross-lingual named entity recognition for languages with no annotated resources. It uses bilingual word embeddings to improve lexical mapping and self-attention to handle word order differences, achieving state-of-the-art or competitive performance with lower resource requirements than past approaches.
For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.
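
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes toy 2-dimensional vectors that already live in a shared bilingual space, uses plain cosine nearest-neighbor search as the translation rule, and replaces the learned projections of a real self-attention layer with identity projections. All function names, words, and vectors are hypothetical.

```python
import numpy as np


def normalize(m):
    """Scale each row to unit length so dot products equal cosine similarities."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def translate_by_nearest_neighbor(src_words, src_vecs, tgt_words, tgt_vecs):
    """Map each source word to the target word with the highest cosine similarity."""
    sims = normalize(src_vecs) @ normalize(tgt_vecs).T  # shape: (n_src, n_tgt)
    best = sims.argmax(axis=1)
    return {s: tgt_words[j] for s, j in zip(src_words, best)}


def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections
    and no positional information (an assumption made for this sketch)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ x


# Hypothetical toy embeddings, assumed to be aligned in one shared space.
src_words = ["house", "river"]
src_vecs = np.array([[1.0, 0.1], [0.1, 1.0]])
tgt_words = ["Fluss", "Haus"]
tgt_vecs = np.array([[0.2, 0.9], [0.9, 0.2]])

lexicon = translate_by_nearest_neighbor(src_words, src_vecs, tgt_words, tgt_vecs)

# Reordering the input tokens only reorders the attention outputs.
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
perm = [2, 0, 1]
order_flexible = np.allclose(
    self_attention(tokens[perm]), self_attention(tokens)[perm]
)
```

In this sketch, `lexicon` pairs each source word with its closest target word, and `order_flexible` checks that attention without positional information produces the same per-token outputs regardless of token order. That property is one way to read the "degree of flexibility with respect to word order" the abstract mentions.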