CL AI HC LG | Sep 18, 2025

Gender and Political Bias in Large Language Models: A Demonstration Platform

Wenjie Lin, Hange Liu, Xutao Mao, Yingying Zhuang, Jingwei Shi, Xudong Han, Tianyu Shi, Jinrui Yang

arXiv:2509.16264v24.91 citations | h-index: 15

Originality: Synthesis-oriented

AI Analysis

This work examines bias in LLMs and is aimed at researchers, educators, and members of the public interested in political analysis. It is incremental, building on existing benchmarks and tools.

The researchers tackled the problem of bias in large language models (LLMs) by creating ParlAI Vote, an interactive platform for exploring European Parliament debates and analyzing LLM performance on vote prediction and gender classification, revealing systematic performance biases in state-of-the-art models.

We present ParlAI Vote, an interactive system for exploring European Parliament debates and votes, and for testing LLMs on vote prediction and bias analysis. This platform connects debate topics, speeches, and roll-call outcomes, and includes rich demographic data such as gender, age, country, and political group. Users can browse debates, inspect linked speeches, compare real voting outcomes with predictions from frontier LLMs, and view error breakdowns by demographic group. Visualizing the EuroParlVote benchmark and its core tasks of gender classification and vote prediction, ParlAI Vote highlights systematic performance bias in state-of-the-art LLMs. The system unifies data, models, and visual analytics in a single interface, lowering the barrier for reproducing findings, auditing behavior, and running counterfactual scenarios. It supports research, education, and public engagement with legislative decision-making, while making clear both the strengths and the limitations of current LLMs in political analysis.
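The per-group error breakdown described above can be illustrated with a minimal sketch. It is not the platform's implementation: the function name `error_rate_by_group`, the field names (`gender`, `actual_vote`, `predicted_vote`), and the four records are all hypothetical, chosen only to show how prediction errors might be compared across demographic groups.

```python
from collections import defaultdict


def error_rate_by_group(records, group_key):
    """Return {group: fraction of records whose predicted vote differs from the actual vote}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec["predicted_vote"] != rec["actual_vote"]:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Made-up illustrative records; these are NOT taken from EuroParlVote.
records = [
    {"gender": "F", "actual_vote": "for", "predicted_vote": "against"},
    {"gender": "F", "actual_vote": "for", "predicted_vote": "for"},
    {"gender": "M", "actual_vote": "against", "predicted_vote": "against"},
    {"gender": "M", "actual_vote": "for", "predicted_vote": "for"},
]

rates = error_rate_by_group(records, "gender")
print(rates)
```

A gap between groups in such a breakdown is the kind of signal the abstract calls systematic performance bias. The same function could be keyed on any other attribute mentioned above, such as country or political group, assuming the records carry that field.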
