CL · Jan 16, 2018

Adversarial Learning for Chinese NER from Crowd Annotations

YaoSheng Yang, Meishan Zhang, Wenliang Chen, Wei Zhang, Haofen Wang, Min Zhang

arXiv:1801.05147v10.834 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of obtaining high-quality labeled data from non-experts for Chinese NER, offering an incremental improvement over existing methods.

The paper tackles the problem of noisy crowd annotations for Chinese Named Entity Recognition by proposing an adversarial learning approach that separates annotator-generic from annotator-specific information, achieving better scores than strong baselines on two domain-specific datasets.

Crowdsourcing offers a quick, low-cost way to obtain new labeled data. In exchange, crowd annotations from non-experts may be of lower quality than those from experts. In this paper, we propose a crowd annotation learning approach for Chinese Named Entity Recognition (NER) that makes full use of the noisy sequence labels from multiple annotators. Inspired by adversarial learning, our approach uses a common Bi-LSTM and a private Bi-LSTM to represent annotator-generic and annotator-specific information, respectively. The annotator-generic information is the common knowledge about entities that the crowd masters easily. Finally, we build our Chinese NE tagger on the LSTM-CRF model. For our experiments, we create two data sets for Chinese NER tasks from two domains. The experimental results show that our system achieves better scores than strong baseline systems.
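The adversarial idea in the abstract can be illustrated with a toy objective for a single token. This is only a minimal sketch under stated assumptions, not the paper's formulation: the abstract does not give the loss, so the function names, the weight `lam`, and the per-token setup are all hypothetical. It assumes a discriminator that tries to guess the annotator from the common (shared) features, and a common encoder that is rewarded when that guess fails, which pushes the shared features toward annotator-generic information.

```python
import math


def cross_entropy(probs, gold):
    """Negative log-probability assigned to the gold class index."""
    return -math.log(probs[gold])


def adversarial_objectives(tag_probs, gold_tag,
                           annotator_probs, gold_annotator, lam=0.1):
    """Return (encoder_loss, discriminator_loss) for one token.

    tag_probs: tagger distribution over NER labels.
    annotator_probs: discriminator distribution over annotator ids,
        assumed to be computed from the common features only.
    lam: hypothetical weight on the adversarial term.
    """
    tag_loss = cross_entropy(tag_probs, gold_tag)
    disc_loss = cross_entropy(annotator_probs, gold_annotator)
    # The discriminator minimises disc_loss. The common encoder gains when
    # the discriminator fails, so its loss subtracts lam * disc_loss.
    encoder_loss = tag_loss - lam * disc_loss
    return encoder_loss, disc_loss


tag_probs = [0.7, 0.2, 0.1]  # toy tagger output, gold label is index 0

# Discriminator cannot tell annotators apart (uniform guess).
confused_enc, confused_disc = adversarial_objectives(
    tag_probs, 0, [1 / 3, 1 / 3, 1 / 3], 0)

# Discriminator identifies the annotator from the common features.
leaky_enc, leaky_disc = adversarial_objectives(
    tag_probs, 0, [0.9, 0.05, 0.05], 0)

# The encoder loss is lower when the common features hide the annotator.
print(confused_enc < leaky_enc)
```

In a full model the two losses would be optimised in alternation or through a gradient reversal step, and the tagger would additionally consume the private, annotator-specific features; both details are left out of this sketch.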
