CY · CL · May 3

The Invisible Coalition Partner: How LLMs Vote When Democracy Gets Concrete

arXiv:2606.00048 · 42.8 · h-index: 1

Predicted impact: top 81% in CY · last 90 days · Originality: Incremental advance

AI Analysis

For researchers and policymakers evaluating LLM political bias, this work demonstrates that abstract surveys are poor proxies for concrete policy decisions, where models instead behave in a centrist, status-quo-favoring way.

Prior research found left-of-center bias in LLMs using abstract questionnaires, but this study shows the bias does not generalize to concrete policy decisions. The study tested 66 LLMs on the abstract Smartvote questionnaire and 9 flagship LLMs on 48 Swiss federal referenda. On the referenda, models aligned with centrist parties rather than leftist ones, cross-linguistic consistency ranged from 50% to 98% depending on the model, and two models showed change-aversion (83-94% 'No' votes).

Prior research has established that instruction-tuned large language models exhibit left-of-center political bias, measured exclusively through abstract political questionnaires. We show that this finding does not generalize to concrete policy decisions. We introduce a dual-instrument methodology grounded in Swiss democratic reality. The Smartvote questionnaire (75 abstract policy questions) is administered to 66 LLMs from 27 model families and compared to 184 elected members of the Swiss National Council, replicating the established leftward convergence (Cohen's d = 3.64, p = 0.0002). Then, novel to this work, 9 flagship LLMs are confronted with 48 real federal referenda (Volksabstimmungen) in four national languages (German, French, Italian, Romansh) under three information conditions, comparing votes to actual outcomes and party recommendations (Parolen). Three findings challenge the prevailing narrative. (1) Abstract questionnaires do not predict concrete behavior: the left-to-right agreement gradient on Smartvote shifts from left-peaked to center-peaked on Volksabstimmungen, where models align most with centrist Die Mitte and FDP rather than leftist SP and Gruene (Wilcoxon p = 0.008). (2) For some models, the language of a political question changes the answer more than the political content does: cross-linguistic consistency ranges from 50% (Mistral) to 98% (GPT-5.4). (3) Two models exhibit systematic change-aversion rather than political bias, voting Nein on 83-94% of referenda regardless of direction (binomial p < 0.0001). What prior work measured as "leftward bias" may not generalize beyond abstract instruments. On concrete policy decisions, LLMs behave less like coalition partners of the left and more like cautious civil servants: centrist, status-quo-favoring, and inconsistent across languages.
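To make findings (2) and (3) concrete, here is a minimal Python sketch of how the two quantities could be computed. It is an illustration under stated assumptions, not the authors' code: the abstract does not define cross-linguistic consistency, so it is assumed here to be the share of referenda on which a model casts the same vote in all four languages, and the binomial test is assumed to use a null of 50% Nein votes. The function names and the toy votes are hypothetical and are not data from the paper.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k. The null p = 0.5 is an assumption.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def cross_lingual_consistency(votes: dict[str, list[str]]) -> float:
    """Share of items on which every language yields the same vote.

    `votes` maps a language code to one vote per referendum, with all
    lists in the same referendum order (assumed definition).
    """
    per_item = list(zip(*votes.values()))
    same = sum(1 for item in per_item if len(set(item)) == 1)
    return same / len(per_item)


# Hypothetical toy votes for one model on four referenda (not paper data).
toy_votes = {
    "de": ["Ja", "Nein", "Nein", "Ja"],
    "fr": ["Ja", "Nein", "Ja", "Ja"],
    "it": ["Ja", "Nein", "Nein", "Ja"],
    "rm": ["Ja", "Nein", "Nein", "Ja"],
}
consistency = cross_lingual_consistency(toy_votes)

# 40 Nein votes out of 48 is roughly the 83% lower bound in the abstract.
p_value = binomial_two_sided_p(40, 48)
```

In the toy data the languages disagree on one of four referenda, so the consistency is 0.75. Under the assumed null, 40 Nein votes out of 48 gives a p-value well below 0.0001, in line with the threshold reported in the abstract.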
