CLMay 28, 2015

Overview of the NLPCC 2015 Shared Task: Chinese Word Segmentation and POS Tagging for Micro-blog Texts

arXiv:1505.07599v314 citations
Originality Synthesis-oriented
AI Analysis

It addresses the problem of processing informal Chinese text for NLP researchers, but is incremental as it focuses on a shared task overview.

This paper presents an overview of the NLPCC 2015 shared task, which tackled Chinese word segmentation and POS tagging for informal micro-blog texts, reporting test results from participating systems.

In this paper, we give an overview for the shared task at the 4th CCF Conference on Natural Language Processing \& Chinese Computing (NLPCC 2015): Chinese word segmentation and part-of-speech (POS) tagging for micro-blog texts. Different with the popular used newswire datasets, the dataset of this shared task consists of the relatively informal micro-texts. The shared task has two sub-tasks: (1) individual Chinese word segmentation and (2) joint Chinese word segmentation and POS Tagging. Each subtask has three tracks to distinguish the systems with different resources. We first introduce the dataset and task, then we characterize the different approaches of the participating systems, report the test results, and provide a overview analysis of these results. An online system is available for open registration and evaluation at http://nlp.fudan.edu.cn/nlpcc2015.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes