CL · SI · Apr 12, 2025

Word Embeddings Track Social Group Changes Across 70 Years in China

arXiv:2504.12327v1 · 2 citations · h-index: 11 · CogSci
Originality: Synthesis-oriented
AI Analysis

This work advances computational social science by providing a non-Western perspective on how official discourse encodes social structure, though it is incremental in applying existing methods to new data.

The researchers analyzed Chinese state-controlled media from 1950 to 2019 to understand how social group representations evolve during revolutionary transformations, finding that stereotypes of ethnicity, age, and body type remained stable, while gender and economic class representations shifted dramatically with historical changes.

Language encodes societal beliefs about social groups through word patterns. While computational methods like word embeddings enable quantitative analysis of these patterns, studies have primarily examined gradual shifts in Western contexts. We present the first large-scale computational analysis of Chinese state-controlled media (1950-2019) to examine how revolutionary social transformations are reflected in official linguistic representations of social groups. Using diachronic word embeddings at multiple temporal resolutions, we find that Chinese representations differ significantly from Western counterparts, particularly regarding economic status, ethnicity, and gender. These representations show distinct evolutionary dynamics: while stereotypes of ethnicity, age, and body type remain remarkably stable across political upheavals, representations of gender and economic classes undergo dramatic shifts tracking historical transformations. This work advances our understanding of how officially sanctioned discourse encodes social structure through language while highlighting the importance of non-Western perspectives in computational social science.
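To make the idea of tracking a group's representation across time slices concrete, here is a minimal sketch of one common way such an analysis can be set up: train one embedding space per period, then measure the mean cosine similarity between a group word and a set of trait words within each period. This is an illustration under stated assumptions, not the authors' pipeline. The function names (`association`, `association_over_time`), the placeholder words, and the toy 2-d vectors are all hypothetical. Because each similarity is computed inside a single period's space, this particular measure does not require aligning the spaces across periods.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(embeddings, group_word, trait_words):
    """Mean cosine similarity between one group word and a list of trait words."""
    group_vec = embeddings[group_word]
    sims = [cosine(group_vec, embeddings[t]) for t in trait_words]
    return sum(sims) / len(sims)


def association_over_time(embeddings_by_period, group_word, trait_words):
    """Compute the association score separately within each period's space."""
    return {
        period: association(emb, group_word, trait_words)
        for period, emb in sorted(embeddings_by_period.items())
    }


# Toy vectors invented purely for illustration (not data from the paper).
toy = {
    "1950s": {"group": [1.0, 0.0], "trait_a": [1.0, 0.0], "trait_b": [0.0, 1.0]},
    "1980s": {"group": [0.0, 1.0], "trait_a": [1.0, 0.0], "trait_b": [0.0, 1.0]},
}

trajectory = association_over_time(toy, "group", ["trait_a"])
print(trajectory)
```

In this toy setup the group word sits next to `trait_a` in the first period and is orthogonal to it in the second, so the score drops between periods. A stable representation would instead show a roughly flat trajectory. Choosing coarser or finer period boundaries for `embeddings_by_period` is one way to vary the temporal resolution.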

