CL Nov 15, 2022

DualNER: A Dual-Teaching framework for Zero-shot Cross-lingual Named Entity Recognition

Jiali Zeng, Yufan Jiang, Yongjing Yin, Xu Wang, Binghuai Lin, Yunbo Cao

arXiv:2211.08104v2 24.0292 citations h-index: 33 Has Code

Originality Incremental advance

AI Analysis

This work addresses named entity recognition across languages without target-language annotations. Its contribution is incremental: it combines existing paradigms rather than introducing a new one.

The paper tackles zero-shot cross-lingual named entity recognition by proposing DualNER, a framework that uses both annotated source-language data and unlabeled target-language text. The authors report that their experiments and analysis demonstrate its effectiveness.

We present DualNER, a simple and effective framework to make full use of both annotated source language corpus and unlabeled target language text for zero-shot cross-lingual named entity recognition (NER). In particular, we combine two complementary learning paradigms of NER, i.e., sequence labeling and span prediction, into a unified multi-task framework. After obtaining a sufficient NER model trained on the source data, we further train it on the target data in a dual-teaching manner, in which the pseudo-labels for one task are constructed from the prediction of the other task. Moreover, based on the span prediction, an entity-aware regularization is proposed to enhance the intrinsic cross-lingual alignment between the same entities in different languages. Experiments and analysis demonstrate the effectiveness of our DualNER. Code is available at https://github.com/lemon0830/dualNER.
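
The dual-teaching idea in the abstract, where the predictions of one task become pseudo-labels for the other, relies on converting between the two output formats. The sketch below illustrates one plausible conversion and is not the authors' implementation. It assumes a BIO tagging scheme for sequence labeling and inclusive (start, end, type) triples for span prediction; the function names `bio_to_spans` and `spans_to_bio` are hypothetical and do not come from the DualNER repository.

```python
from typing import List, Tuple

# Assumed span format: (start index, inclusive end index, entity type).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Turn sequence-labeling output (BIO tags) into span pseudo-labels."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags):
        # An open entity continues only on an I- tag of the same type.
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i - 1, etype))
            start, etype = None, None
        # Open a new entity on B-, or leniently on a stray I- tag.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    if start is not None:
        spans.append((start, len(tags) - 1, etype))
    return spans


def spans_to_bio(spans: List[Span], length: int) -> List[str]:
    """Turn span-prediction output into BIO pseudo-labels.

    Assumes the spans do not overlap.
    """
    tags = ["O"] * length
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{etype}"
    return tags


# Sequence-labeling predictions become targets for the span task...
seq_pred = ["B-PER", "I-PER", "O", "B-LOC"]
span_pseudo = bio_to_spans(seq_pred)

# ...and span predictions become targets for the sequence-labeling task.
span_pred = [(0, 1, "PER"), (3, 3, "LOC")]
seq_pseudo = spans_to_bio(span_pred, length=4)
```

In a training loop, each head would then be optimized against the pseudo-labels derived from the other head on unlabeled target-language text. How the paper filters or weights those pseudo-labels is not stated in the abstract and is left out here.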
