CL, Apr 19, 2020

A Chinese Corpus for Fine-grained Entity Typing

arXiv:2004.08825v1999 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a resource for researchers working on Chinese NLP, but the contribution is incremental: it adapts an existing task to a new language.

The authors tackled the lack of datasets for fine-grained entity typing in Chinese by introducing a manually labeled corpus with 4,800 mentions, and they demonstrated its utility through experiments with neural models and cross-lingual transfer learning.

Fine-grained entity typing is a challenging task with wide applications. However, most existing datasets for this task are in English. In this paper, we introduce a corpus for Chinese fine-grained entity typing that contains 4,800 mentions manually labeled through crowdsourcing. Each mention is annotated with free-form entity types. To make our dataset useful in more possible scenarios, we also categorize all the fine-grained types into 10 general types. Finally, we conduct experiments with some neural models whose structures are typical in fine-grained entity typing and show how well they perform on our dataset. We also show the possibility of improving Chinese fine-grained entity typing through cross-lingual transfer learning.
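To make the annotation scheme described above concrete, here is a minimal sketch of how one mention with free-form types might be represented and mapped to general types. The field names, the example sentence, the type names, and the mapping table are all hypothetical placeholders chosen for illustration; they are not the corpus's actual schema or its 10 general types.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping from free-form fine-grained types to general types.
# These names are illustrative only, not the label set used in the corpus.
FINE_TO_GENERAL: Dict[str, str] = {
    "singer": "person",
    "actor": "person",
    "city": "location",
    "university": "organization",
}


@dataclass
class Mention:
    """One annotated mention: a character span in a sentence plus free-form types."""
    sentence: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    fine_types: List[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Surface string of the mention inside its sentence.
        return self.sentence[self.start:self.end]


def general_types(mention: Mention, mapping: Dict[str, str] = FINE_TO_GENERAL) -> List[str]:
    """Return the sorted, de-duplicated general types for a mention.

    Fine-grained types missing from the mapping are skipped, which is an
    assumption of this sketch rather than documented behavior of the dataset.
    """
    return sorted({mapping[t] for t in mention.fine_types if t in mapping})


# Made-up example record (not taken from the corpus).
example = Mention(
    sentence="周杰伦在台北举办了演唱会。",
    start=0,
    end=3,
    fine_types=["singer", "actor", "musician"],
)

if __name__ == "__main__":
    print(example.text, example.fine_types, general_types(example))
```

Keeping the free-form labels and deriving the general types from a lookup table means the coarse categories can be revised without touching the annotations, which fits the stated goal of making the dataset usable in more scenarios.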

Code Implementations: 1 repo