CYLG Jan 25, 2025

Fairness in LLM-Generated Surveys

arXiv:2501.15351v1
3 citations
h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses fairness issues in LLMs for survey applications, highlighting biases that affect global applicability, though it is incremental in proposing a measurement framework.

The study tackled the problem of biases in LLM-generated surveys across diverse populations by analyzing data from Chile and the U.S., finding that LLMs consistently performed better on U.S. datasets due to U.S.-centric training data, with performance disparities linked to different socio-demographic factors in each country.

Large Language Models (LLMs) excel in text generation and understanding, especially in simulating socio-political and economic patterns, serving as an alternative to traditional surveys. However, their global applicability remains questionable due to unexplored biases across socio-demographic and geographic contexts. This study examines how LLMs perform across diverse populations by analyzing public surveys from Chile and the United States, focusing on predictive accuracy and fairness metrics. The results show performance disparities, with LLMs consistently performing better on U.S. datasets. This bias originates from U.S.-centric training data and remains evident after accounting for socio-demographic differences. In the U.S., political identity and race significantly influence prediction accuracy, while in Chile, gender, education, and religious affiliation play more pronounced roles. Our study presents a novel framework for measuring socio-demographic biases in LLMs, offering a path toward ensuring fairer and more equitable model performance across diverse socio-cultural contexts.
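The abstract describes comparing predictive accuracy across countries and socio-demographic groups. As a rough illustration of that kind of comparison, here is a minimal Python sketch that computes accuracy per group and the largest gap between groups. It is not the paper's framework: the function names (`accuracy_by_group`, `accuracy_gap`), the record fields (`predicted`, `actual`), and the toy records are all assumptions made up for this example, and the paper's actual fairness metrics are not specified here.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group value: accuracy} over records that carry
    'predicted' and 'actual' fields (hypothetical field names)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["predicted"] == rec["actual"]:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest accuracy difference between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy, made-up records for illustration only (not data from the study).
records = [
    {"country": "US", "gender": "F", "predicted": "A", "actual": "A"},
    {"country": "US", "gender": "M", "predicted": "B", "actual": "B"},
    {"country": "US", "gender": "F", "predicted": "A", "actual": "B"},
    {"country": "US", "gender": "M", "predicted": "A", "actual": "A"},
    {"country": "CL", "gender": "F", "predicted": "A", "actual": "B"},
    {"country": "CL", "gender": "M", "predicted": "B", "actual": "B"},
    {"country": "CL", "gender": "F", "predicted": "B", "actual": "A"},
    {"country": "CL", "gender": "M", "predicted": "A", "actual": "A"},
]

by_country = accuracy_by_group(records, "country")
by_gender = accuracy_by_group(records, "gender")

print("accuracy by country:", by_country)
print("country gap:", accuracy_gap(by_country))
print("accuracy by gender:", by_gender)
print("gender gap:", accuracy_gap(by_gender))
```

Running the same grouping separately within each country would show which attributes drive the disparity in each one, which is the kind of contrast the abstract reports between the U.S. and Chile.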

