Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values
This work addresses the alignment of LLMs in high-stakes ethical decisions such as organ allocation, though its contribution is incremental because it builds on existing fine-tuning methods.
The study evaluated LLMs in kidney allocation decisions, finding that they deviate from human moral values and rarely express indecision, but that low-rank supervised fine-tuning with few samples often improved decision consistency and the calibration of indecision modeling.
The rapid integration of Large Language Models (LLMs) in high-stakes decision-making -- such as allocating scarce resources like donor organs -- raises critical questions about their alignment with human moral values. We systematically evaluate the behavior of several prominent LLMs against human preferences in kidney allocation scenarios and show that LLMs: i) exhibit stark deviations from human values in prioritizing various attributes, and ii) in contrast to humans, rarely express indecision, opting for deterministic decisions even when alternative indecision mechanisms (e.g., coin flipping) are provided. Nonetheless, we show that low-rank supervised fine-tuning with few samples is often effective both in improving decision consistency and in calibrating indecision modeling. These findings illustrate the necessity of explicit alignment strategies for LLMs in moral/ethical domains.
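As a rough illustration of the comparison described above, the following Python sketch computes how often a decision-maker defers to a coin flip and how far its choice distribution lies from a human one. It is only a minimal sketch under stated assumptions: the option labels ("A", "B", "flip"), the function names, the use of total variation distance, and the toy data are all hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical option labels: choose patient A, choose patient B,
# or defer to a coin flip (the indecision option).
CHOICES = ("A", "B", "flip")


def choice_distribution(decisions):
    """Return the fraction of decisions that fall on each option."""
    if not decisions:
        raise ValueError("decisions must be non-empty")
    counts = Counter(decisions)
    total = len(decisions)
    return {c: counts.get(c, 0) / total for c in CHOICES}


def indecision_rate(decisions):
    """Fraction of decisions that defer to the coin flip."""
    return choice_distribution(decisions)["flip"]


def total_variation(p, q):
    """Total variation distance between two choice distributions."""
    return 0.5 * sum(abs(p[c] - q[c]) for c in CHOICES)


# Toy data, invented purely for illustration (not results from the study).
human = ["A", "flip", "A", "B", "flip", "A", "flip", "A"]
model = ["A", "A", "A", "A", "A", "A", "A", "B"]

human_dist = choice_distribution(human)
model_dist = choice_distribution(model)

print("human indecision rate:", indecision_rate(human))
print("model indecision rate:", indecision_rate(model))
print("distance between distributions:", total_variation(human_dist, model_dist))
```

In this toy setup, a model that never selects the coin flip shows an indecision rate of zero regardless of how often humans defer, which is the kind of gap the abstract describes. Other distance measures could be substituted for total variation without changing the structure of the sketch.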