CL · Apr 19, 2020

A Chinese Corpus for Fine-grained Entity Typing

Chin Lee, Hongliang Dai, Yangqiu Song, Xin Li

arXiv:2004.08825v131.1999 citations · h-index: 50 · Has Code

Originality: Synthesis-oriented

AI Analysis

The corpus is a useful resource for researchers working on Chinese NLP, but the contribution is incremental, since it adapts an existing task to a new language.

The authors tackled the lack of datasets for fine-grained entity typing in Chinese by introducing a manually labeled corpus with 4,800 mentions, and they demonstrated its utility through experiments with neural models and cross-lingual transfer learning.

Fine-grained entity typing is a challenging task with wide applications. However, most existing datasets for this task are in English. In this paper, we introduce a corpus for Chinese fine-grained entity typing that contains 4,800 mentions manually labeled through crowdsourcing. Each mention is annotated with free-form entity types. To make our dataset useful in more possible scenarios, we also categorize all the fine-grained types into 10 general types. Finally, we conduct experiments with some neural models whose structures are typical in fine-grained entity typing and show how well they perform on our dataset. We also show the possibility of improving Chinese fine-grained entity typing through cross-lingual transfer learning.
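As a rough illustration of the annotation scheme the abstract describes (each mention carries free-form fine-grained types, which are then grouped into a small set of general types), a record could be modeled as sketched below. Everything in the sketch is an assumption made for illustration: the `Mention` class, the `FINE_TO_GENERAL` mapping, the type names, and the example sentence are hypothetical and are not taken from the corpus, whose actual 10 general types and file format are not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping from free-form fine-grained types to general types.
# These names are placeholders, not the paper's actual 10 general types.
FINE_TO_GENERAL: Dict[str, str] = {
    "basketball player": "person",
    "athlete": "person",
    "university": "organization",
    "city": "location",
}


@dataclass
class Mention:
    """One annotated entity mention (hypothetical record layout)."""
    sentence: str
    start: int  # character offset where the mention begins
    end: int    # character offset one past the mention's last character
    fine_types: List[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Surface form of the mention inside its sentence.
        return self.sentence[self.start:self.end]


def general_types(mention: Mention,
                  mapping: Dict[str, str] = FINE_TO_GENERAL) -> List[str]:
    """Return the sorted, de-duplicated general types for a mention.

    Fine-grained types missing from the mapping are skipped.
    """
    return sorted({mapping[t] for t in mention.fine_types if t in mapping})


# Made-up example sentence ("Yao Ming was born in Shanghai."), not from the corpus.
example = Mention(
    sentence="姚明出生于上海。",
    start=0,
    end=2,
    fine_types=["basketball player", "athlete"],
)
```

Keeping the free-form labels on the record and deriving the general types through a separate mapping means the same annotations can serve both a fine-grained and a coarse-grained setup.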
