CL, Jan 16, 2018

Adversarial Learning for Chinese NER from Crowd Annotations

arXiv:1801.05147v1, 34 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of obtaining high-quality labeled data from non-experts for Chinese NER, offering an incremental improvement over existing methods.

The paper tackles the problem of noisy crowd annotations for Chinese Named Entity Recognition by proposing an adversarial learning approach that separates annotator-generic from annotator-specific information, achieving better scores than strong baselines on two domain-specific datasets.

Crowdsourcing offers a way to obtain new labeled data quickly and at lower cost. In exchange, crowd annotations from non-experts may be of lower quality than those from experts. In this paper, we propose an approach to crowd annotation learning for Chinese Named Entity Recognition (NER) that makes full use of the noisy sequence labels from multiple annotators. Inspired by adversarial learning, our approach uses a common Bi-LSTM and a private Bi-LSTM to represent annotator-generic and annotator-specific information, respectively. The annotator-generic information is the common knowledge about entities that the crowd masters easily. Finally, we build our Chinese NE tagger on the LSTM-CRF model. In our experiments, we create two data sets for Chinese NER tasks from two domains. The experimental results show that our system achieves better scores than strong baseline systems.
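
The abstract names an LSTM-CRF tagger as the final component. As a rough illustration of what the CRF part of such a tagger does at prediction time, the sketch below runs Viterbi decoding over per-token tag scores. It is not the authors' implementation: the tag set, the emission scores (which in an LSTM-CRF would come from the Bi-LSTM features), and the transition scores are all hypothetical, hand-written values chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set for a single entity type (an assumption for this sketch).
TAGS = ["O", "B-PER", "I-PER"]


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str] = TAGS,
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the score of `tag` at token t.
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`;
    missing pairs are treated as 0.0.
    """
    if not emissions:
        return []
    # best[t][tag] = score of the best path that ends in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        cur: Dict[str, float] = {}
        ptr: Dict[str, str] = {}
        for tag in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, tag), 0.0),
            )
            cur[tag] = best[-1][prev] + transitions.get((prev, tag), 0.0) + scores[tag]
            ptr[tag] = prev
        best.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hand-written scores for a three-token sentence (illustrative only).
EMISSIONS = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},
    {"O": 1.0, "B-PER": 0.2, "I-PER": 0.9},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.3},
]
TRANSITIONS = {
    ("O", "I-PER"): -10.0,    # discourage an I- tag directly after O
    ("B-PER", "I-PER"): 1.0,  # encourage continuing an entity
    ("I-PER", "I-PER"): 0.5,
}

print(viterbi(EMISSIONS, TRANSITIONS))
```

With these scores, picking the best tag per token independently would label the second token O, while the transition scores make the decoder continue the entity instead. That sequence-level consistency is the usual reason for placing a CRF on top of Bi-LSTM features.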
