CL · Apr 20

Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language

arXiv:2604.15841 · 28.4 · h-index: 4
AI Analysis

For NLP researchers, this work highlights the limitations of LLMs on a specific subcultural internet language, promoting research on multicultural integration.

This paper introduces Mouse, a benchmark for evaluating LLMs on Chinese Chouxiang Language across six tasks, finding that current SOTA LLMs perform well on contextual semantic understanding but show clear limitations on other tasks.

While large language models (LLMs) have achieved remarkable success in general language tasks, their performance on Chouxiang Language, a representative subcultural language in the Chinese internet context, remains largely unexplored. In this paper, we introduce Mouse, a specialized benchmark designed to evaluate the capabilities of LLMs on six NLP tasks involving Chouxiang Language. Experimental results show that current state-of-the-art (SOTA) LLMs exhibit clear limitations on multiple tasks, while performing well on tasks that involve contextual semantic understanding. We further discuss the reasons behind the generally low performance of SOTA LLMs on Chouxiang Language, examine whether the LLM-as-a-judge approach adopted for translation tasks aligns with human judgments and values, and analyze the key factors that influence Chouxiang translation. Our study aims to promote further research in the NLP community on multicultural integration and the dynamics of evolving internet languages. Our code and data are publicly available.

