LG · AI · May 22

It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

arXiv:2605.23825 · 51.1
Predicted impact: top 69% in LG · last 90 days · Originality: Highly original
AI Analysis

For AI developers and policymakers, the paper shows that geopolitical biases are actively shaped during post-training, which calls for greater transparency and auditing of alignment processes.

Geopolitical bias in LLMs originates in post-training, not pre-training, as shown by testing seven open-weight model pairs across 28 country pairs. For example, Alibaba's Qwen 2.5 base model is neutral on China-favourability (-0.15 log-odds, p=0.15), while the chat variant shows +2.91 (p<10^-4), an 18x shift in odds.

It has generally been assumed that geopolitical bias in language models originates in the data used for pre-training. We tested seven open-weight LLM pairs, each consisting of a base model (pre-training only) and a chat model (pre-training and post-training), from seven labs on a paired-scenario forced-choice probe over 28 country pairs in English, French, and Chinese, and found that geopolitical bias originates in post-training rather than in pre-training. For six of the seven labs, post-training shifted the model in the direction associated with the developer's country or region. The shift is strongest in Alibaba's Qwen 2.5: the base model is neutral on China-favourability (-0.15 log-odds, p=0.15), whereas the post-trained chat variant is at +2.91 (p<10^-4), an 18x shift in odds. We also observe shifts in bias toward other countries across all models. Additionally, the magnitude of the shift depends on the prompt language: the French-made Mistral becomes pro-France only under French prompting (FR-EN shift +1.91, p<10^-4). These findings suggest that geopolitical preferences in language models are not simply inherited from large-scale internet data but are actively shaped during post-training, highlighting the need for greater transparency, auditing, and oversight of alignment processes that influence how models represent nations, cultures, and political perspectives.
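
To make the reported effect sizes easier to read, the following minimal sketch converts the log-odds values quoted in the abstract into odds ratios. It assumes the log-odds are natural-log values, which the abstract does not state explicitly; the function and variable names are illustrative and do not come from the paper.

```python
import math


def odds_ratio(log_odds: float) -> float:
    """Convert a log-odds value to an odds ratio (assumes natural log)."""
    return math.exp(log_odds)


# Values quoted in the abstract for Qwen 2.5 (China-favourability).
base_log_odds = -0.15  # base model, not significantly different from 0
chat_log_odds = 2.91   # post-trained chat model

# Odds ratio implied by the chat model's log-odds on its own.
chat_odds = odds_ratio(chat_log_odds)

# Odds ratio implied by the difference between the two point estimates.
shift_odds = odds_ratio(chat_log_odds - base_log_odds)

# FR-EN shift reported for Mistral under French prompting.
mistral_fr_en_odds = odds_ratio(1.91)

print(f"Qwen 2.5 chat odds ratio:       {chat_odds:.1f}")
print(f"Qwen 2.5 chat vs. base ratio:   {shift_odds:.1f}")
print(f"Mistral FR-EN shift odds ratio: {mistral_fr_en_odds:.1f}")
```

Under that assumption, the chat model's +2.91 alone corresponds to an odds ratio of about 18.4, which matches the quoted "18x" if the base model is treated as neutral. Taking the difference of the two point estimates (3.06 log-odds) would instead give about 21.3, so the exact multiplier depends on whether the base model's non-significant -0.15 is counted.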
