Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
This work addresses the challenge of limited labeled data for NER in low-resource languages. It offers a domain-specific solution that builds incrementally on existing translation and pretraining methods.
The paper tackles poor named entity recognition (NER) performance in low-resource languages by proposing the Converse Attention Network (CAN), which transfers knowledge from pretrained high-resource English models through attention-based translation and alignment. It reports consistent and significant performance improvements on four low-resource NER datasets.
In recent years, great success has been achieved in many natural language processing (NLP) tasks, e.g., named entity recognition (NER), especially for high-resource languages such as English, thanks in part to the considerable amount of labeled resources. However, most low-resource languages lack such an abundance of labeled data, which leads to poor NER performance in these languages. Inspired by knowledge transfer, we propose the Converse Attention Network, or CAN for short, to improve NER performance in low-resource languages by leveraging the knowledge learned by pretrained high-resource English models. CAN first translates low-resource languages into high-resource English using an attention-based translation module. In the process of translation, CAN obtains the attention matrices that align the two languages. CAN then uses these attention matrices to align the high-resource semantic features from a pretrained English model with the low-resource semantic features. As a result, CAN obtains aligned high-resource semantic features that enrich the representations of low-resource languages. Experiments on four low-resource NER datasets show that CAN achieves consistent and significant performance improvements, which indicates its effectiveness.
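The alignment step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes an attention matrix of shape (source length, English length) whose rows sum to 1, and it assumes the aligned English features are combined with the low-resource features by concatenation; the abstract specifies neither detail. The function names `align_high_resource_features` and `enrich` are hypothetical.

```python
import numpy as np


def align_high_resource_features(attention, english_features):
    """Project English features onto low-resource token positions.

    attention:        (src_len, tgt_len), each row sums to 1 (assumed layout).
    english_features: (tgt_len, d_en), e.g. from a pretrained English model.
    Returns an array of shape (src_len, d_en): one attention-weighted
    average of English features per low-resource token.
    """
    return attention @ english_features


def enrich(low_features, attention, english_features):
    """Concatenate low-resource features with aligned English features.

    Concatenation is an assumption made for illustration only.
    """
    aligned = align_high_resource_features(attention, english_features)
    return np.concatenate([low_features, aligned], axis=-1)


# Toy example with random stand-ins for real model outputs.
rng = np.random.default_rng(0)
src_len, tgt_len, d_low, d_en = 3, 4, 5, 6

scores = rng.normal(size=(src_len, tgt_len))
# Row-wise softmax so each low-resource token has a distribution
# over the English tokens.
attention = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

low = rng.normal(size=(src_len, d_low))
eng = rng.normal(size=(tgt_len, d_en))

enriched = enrich(low, attention, eng)
print(enriched.shape)  # (3, 11)
```

Each low-resource token ends up with its own features plus a weighted mix of the English features it is aligned to, so a downstream NER tagger could consume the enriched sequence in place of the original one.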