CY · AI · Jan 17, 2025

Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude

arXiv:2501.10484v5 · 2 citations · h-index: 2 · Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses fairness in LLM decision-making on ethical dilemmas and provides a systematic evaluation approach, though its contribution is incremental: it applies existing bias analysis methods to new models and scenarios.

This study evaluated nine large language models (LLMs) on ethical dilemmas involving protected attributes, revealing significant biases across all models, with open-source LLMs showing stronger preferences for marginalized groups in harmful scenarios and closed-source models favoring mainstream groups in protective ones.

Recent advances in Large Language Models (LLMs) have enabled human-like responses across various tasks, raising questions about their ethical decision-making capabilities and potential biases. This study systematically evaluates how nine popular LLMs (both open-source and closed-source) respond to ethical dilemmas involving protected attributes. Across 50,400 trials spanning single and intersectional attribute combinations in four dilemma scenarios (protective vs. harmful), we assess models' ethical preferences, sensitivity, stability, and clustering patterns. Results reveal significant biases with respect to protected attributes in all models, with preferences that differ by model type and dilemma context. Notably, open-source LLMs show stronger preferences for marginalized groups and greater sensitivity in harmful scenarios, while closed-source models are more selective in protective situations and tend to favor mainstream groups. We also find that ethical behavior varies across dilemma types: LLMs maintain consistent patterns in protective scenarios but respond with more diverse and cognitively demanding decisions in harmful ones. Furthermore, models display more pronounced ethical tendencies under intersectional conditions than in single-attribute settings, suggesting that complex inputs reveal deeper biases. These findings highlight the need for multi-dimensional, context-aware evaluation of LLMs' ethical behavior and offer a systematic approach to evaluating, understanding, and addressing fairness in LLM decision-making.
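As a rough illustration of what measuring "ethical preferences" over many trials could look like, the sketch below tallies how often each attribute value is chosen per model and scenario, then reports the largest gap from a uniform (unbiased) choice rate. This is not the paper's code or metric: the record layout, the function names `selection_rates` and `max_deviation_from_uniform`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def selection_rates(trials):
    """Return per-(model, scenario) selection rates for each chosen attribute.

    `trials` is an iterable of (model, scenario, chosen_attribute) tuples.
    This record layout is hypothetical, not taken from the paper.
    """
    counts = defaultdict(Counter)
    for model, scenario, chosen in trials:
        counts[(model, scenario)][chosen] += 1

    rates = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        rates[key] = {attr: n / total for attr, n in counter.items()}
    return rates


def max_deviation_from_uniform(rate_by_attr, n_options):
    """Largest absolute gap between an observed rate and the uniform rate 1/n_options.

    Options that were never chosen do not appear in `rate_by_attr`; their
    rate is 0, so their gap equals the uniform rate itself.
    """
    uniform = 1.0 / n_options
    deviations = [abs(rate - uniform) for rate in rate_by_attr.values()]
    if len(rate_by_attr) < n_options:
        deviations.append(uniform)
    return max(deviations)


# Toy data: a hypothetical model picks "group_x" in 3 of 4 protective trials.
toy_trials = (
    [("model_a", "protective", "group_x")] * 3
    + [("model_a", "protective", "group_y")] * 1
)
toy_rates = selection_rates(toy_trials)
toy_gap = max_deviation_from_uniform(toy_rates[("model_a", "protective")], n_options=2)
```

A gap near zero would indicate choices close to uniform across the options, while a larger gap would flag a preference worth testing for statistical significance. Which test to use, and how sensitivity, stability, and clustering are computed, would have to follow the paper's own definitions.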

