CL | May 21, 2025

Can Large Language Models Understand Internet Buzzwords Through User-Generated Content

arXiv:2505.15071v1 | 2 citations | h-index: 9 | Has Code | ACL
Originality: Incremental advance
AI Analysis

This work addresses how well LLMs comprehend internet slang, a question relevant to researchers and developers in natural language processing. It is an incremental advance: it builds on existing definition generation methods, adding a new dataset and a steering technique.

The paper asks whether large language models (LLMs) can generate accurate definitions for Chinese internet buzzwords from user-generated content (UGC). It introduces a dataset (CHEER) and a method (RESS) that improves definition accuracy. The benchmarks show that RESS is effective while highlighting open challenges such as over-reliance on prior exposure.

The massive user-generated content (UGC) available on Chinese social media makes it possible to study internet buzzwords. In this paper, we study whether large language models (LLMs) can generate accurate definitions for these buzzwords using UGC as examples. Our work makes three contributions. First, we introduce CHEER, the first dataset of Chinese internet buzzwords, each annotated with a definition and relevant UGC. Second, we propose a novel method, called RESS, to effectively steer the comprehension process of LLMs to produce more accurate buzzword definitions, mirroring the skills of human language learning. Third, with CHEER, we benchmark the strengths and weaknesses of various off-the-shelf definition generation methods and our RESS. Our benchmark demonstrates the effectiveness of RESS while revealing crucial shared challenges: over-reliance on prior exposure, underdeveloped inferential abilities, and difficulty identifying high-quality UGC to facilitate comprehension. We believe our work lays the groundwork for future advancements in LLM-based definition generation. Our dataset and code are available at https://github.com/SCUNLP/Buzzword.
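The task setup described in the abstract (a buzzword, a reference definition, and UGC posts used as examples) can be illustrated with a minimal sketch. This is not the RESS method and not the authors' code. The `BuzzwordEntry` record shape, the `build_definition_prompt` helper, the prompt wording, and the sample posts are all assumptions made up for illustration, and none of them are taken from CHEER.

```python
from dataclasses import dataclass


@dataclass
class BuzzwordEntry:
    """Hypothetical record shape: a buzzword, a reference definition, and UGC posts."""
    buzzword: str
    definition: str
    ugc: list[str]


def build_definition_prompt(entry: BuzzwordEntry, max_examples: int = 5) -> str:
    """Assemble a plain prompt that asks a model to define the buzzword from UGC.

    The reference definition is deliberately left out of the prompt so it can
    be used later to score the model's answer.
    """
    examples = entry.ugc[:max_examples]  # cap the number of posts shown
    lines = [f"Buzzword: {entry.buzzword}", "Example posts:"]
    lines += [f"{i}. {post}" for i, post in enumerate(examples, start=1)]
    lines.append(
        "Based only on the posts above, write a one-sentence definition of the buzzword."
    )
    return "\n".join(lines)


# Illustrative entry: the posts and the gloss are invented, not CHEER annotations.
sample = BuzzwordEntry(
    buzzword="yyds",
    definition="Praise for someone or something as the greatest (illustrative gloss).",
    ugc=[
        "This noodle shop is yyds, I come here every week.",
        "Her final lap was unreal. yyds!",
    ],
)

if __name__ == "__main__":
    print(build_definition_prompt(sample))
```

The resulting string could be sent to any LLM, and the generated definition compared against `entry.definition`. How that comparison is scored is left open here, since the abstract does not specify the evaluation metric.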

Code Implementations: 1 repo