CL · Nov 15, 2022

DualNER: A Dual-Teaching framework for Zero-shot Cross-lingual Named Entity Recognition

arXiv:2211.08104v2 · 292 citations · h-index: 33 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses named entity recognition across languages without target-language annotations. Its contribution is incremental: it combines existing paradigms rather than introducing a new one.

The paper tackles zero-shot cross-lingual named entity recognition with DualNER, a framework that uses both annotated source-language data and unlabeled target-language text. The authors report that their experiments and analysis demonstrate its effectiveness.

We present DualNER, a simple and effective framework to make full use of both annotated source language corpus and unlabeled target language text for zero-shot cross-lingual named entity recognition (NER). In particular, we combine two complementary learning paradigms of NER, i.e., sequence labeling and span prediction, into a unified multi-task framework. After obtaining a sufficient NER model trained on the source data, we further train it on the target data in a dual-teaching manner, in which the pseudo-labels for one task are constructed from the prediction of the other task. Moreover, based on the span prediction, an entity-aware regularization is proposed to enhance the intrinsic cross-lingual alignment between the same entities in different languages. Experiments and analysis demonstrate the effectiveness of our DualNER. Code is available at https://github.com/lemon0830/dualNER.
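
The dual-teaching step in the abstract relies on turning one task's predictions into pseudo-labels for the other. The sketch below illustrates only that conversion, under assumptions not stated in the abstract: tags follow the BIO scheme, spans are inclusive token-index triples, and overlapping spans are simply skipped. The function names `bio_to_spans` and `spans_to_bio` are hypothetical and are not taken from the authors' repository; any confidence filtering or training logic the real system uses is omitted.

```python
from typing import List, Optional, Tuple

# Assumed span format: (start_index, end_index_inclusive, entity_type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Sequence-labeling output -> span pseudo-labels."""
    spans: List[Span] = []
    start: Optional[int] = None
    etype: Optional[str] = None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i - 1, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def spans_to_bio(spans: List[Span], length: int) -> List[str]:
    """Span-prediction output -> sequence-labeling pseudo-labels."""
    tags = ["O"] * length
    for start, end, etype in spans:
        # Simplifying assumption: drop a span that overlaps an earlier one.
        if any(t != "O" for t in tags[start:end + 1]):
            continue
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{etype}"
    return tags


# Round trip on a toy prediction over four tokens.
predicted_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pseudo_spans = bio_to_spans(predicted_tags)
pseudo_tags = spans_to_bio(pseudo_spans, len(predicted_tags))
```

In a dual-teaching loop, `pseudo_spans` would supervise the span-prediction head and `pseudo_tags` the sequence-labeling head on unlabeled target-language text; how the paper weights or filters these labels is not specified in the abstract.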

