CL, Apr 14, 2025

C-MTCSD: A Chinese Multi-Turn Conversational Stance Detection Dataset

Fuqiang Niu, Yi Yang, Xianghua Fu, Genan Dai, Bowen Zhang

arXiv:2504.09958v2, 5 citations, h-index: 7, WWW

Originality: Synthesis-oriented

AI Analysis

This work addresses a gap in Chinese language processing for social media analysis, providing a challenging benchmark for researchers, though it is incremental as it focuses on dataset creation rather than novel methods.

The authors address the lack of large-scale datasets for Chinese multi-turn conversational stance detection by introducing C-MTCSD, a dataset of 24,264 annotated instances from Sina Weibo that is 4.2 times larger than the only prior Chinese conversational stance detection dataset. They find that even state-of-the-art models reach only a 64.07% F1 score in the zero-shot setting, and that performance degrades as conversation depth increases.

Stance detection has become an essential tool for analyzing public discussions on social media. Current methods face significant challenges, particularly in Chinese language processing and multi-turn conversational analysis. To address these limitations, we introduce C-MTCSD, the largest Chinese multi-turn conversational stance detection dataset, comprising 24,264 carefully annotated instances from Sina Weibo, which is 4.2 times larger than the only prior Chinese conversational stance detection dataset. Our comprehensive evaluation using both traditional approaches and large language models reveals the complexity of C-MTCSD: even state-of-the-art models achieve only 64.07% F1 score in the challenging zero-shot setting, while performance consistently degrades with increasing conversation depth. Traditional models particularly struggle with implicit stance detection, achieving below 50% F1 score. This work establishes a challenging new benchmark for Chinese stance detection research, highlighting significant opportunities for future improvements.
