Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization
This is an incremental contribution, creating a benchmark for researchers working on Chinese text classification tasks.
The paper introduces the NLPCC 2017 shared task on Chinese News Headline Categorization, providing a dataset of 12,000 short texts across 18 classes and making it publicly available with example code.
In this paper, we give an overview of the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization. The dataset of this shared task consists of 18 classes and 12,000 short texts, along with the corresponding labels for each class. The dataset and example code can be accessed at https://github.com/FudanNLP/nlpcc2017_news_headline_categorization.
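As a rough illustration of working with labeled headline data of this kind, the sketch below parses labeled lines and counts the examples per class. The file layout is an assumption, not something stated in the paper: each line is taken to be a class label, a tab, and the headline text. The function names and the sample labels and headlines are hypothetical, so check the repository for the actual format before adapting it.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def parse_labeled_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse lines into (label, text) pairs.

    ASSUMPTION: each non-blank line is 'label<TAB>headline'. This layout
    is hypothetical and may differ from the released files.
    """
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        label, sep, text = line.partition("\t")
        if not sep:
            raise ValueError(f"expected a tab separator in line: {line!r}")
        pairs.append((label.strip(), text.strip()))
    return pairs


def class_distribution(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count how many headlines carry each class label."""
    return Counter(label for label, _ in pairs)


# Hypothetical sample lines, made up for illustration only.
sample = [
    "sports\t国足 今晚 迎战 对手\n",
    "finance\t央行 宣布 下调 利率\n",
    "sports\t决赛 门票 今日 开售\n",
]

pairs = parse_labeled_lines(sample)
for label, count in class_distribution(pairs).most_common():
    print(label, count)
```

With a real file, the same functions could be applied to an open file handle, for example `parse_labeled_lines(open(path, encoding="utf-8"))`, where `path` is whatever file the repository provides.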