Jun 20, 2025

Cash or Comfort? How LLMs Value Your Inconvenience

arXiv:2506.17367v1 | 1 citation | h-index: 4
Originality: Incremental advance
AI Analysis

The paper highlights critical flaws in using current LLMs as decision-making assistants in personal scenarios that involve cash-versus-comfort trade-offs. It is an incremental but important step for AI safety and alignment.

The study quantified the prices assigned by multiple LLMs to user discomforts like walking, waiting, hunger, and pain, revealing large variances between models, fragility to prompt phrasing, and unreasonable valuations such as accepting 1 Euro for a 10-hour wait.

Large Language Models (LLMs) are increasingly proposed as near-autonomous artificial intelligence (AI) agents capable of making everyday decisions on behalf of humans. Although LLMs perform well on many technical tasks, their behaviour in personal decision-making remains less understood. Previous studies have assessed their rationality and moral alignment with human decisions. However, the behaviour of AI assistants in scenarios where financial rewards are at odds with user comfort has not yet been thoroughly explored. In this paper, we tackle this problem by quantifying the prices assigned by multiple LLMs to a series of user discomforts: additional walking, waiting, hunger and pain. We uncover several key concerns that strongly question the prospect of using current LLMs as decision-making assistants: (1) a large variance in responses between LLMs, (2) within a single LLM, responses show fragility to minor variations in prompt phrasing (e.g., reformulating the question in the first person can considerably alter the decision), (3) LLMs can accept unreasonably low rewards for major inconveniences (e.g., 1 Euro to wait 10 hours), and (4) LLMs can reject monetary gains where no discomfort is imposed (e.g., 1,000 Euro to wait 0 minutes). These findings emphasize the need for scrutiny of how LLMs value human inconvenience, particularly as we move toward applications where such cash-versus-comfort trade-offs are made on users' behalf.
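Concerns (3) and (4) in the abstract can be read as simple sanity checks on a grid of accept/reject decisions. The sketch below is illustrative only and is not the authors' code: `decide` is a hypothetical stand-in for an LLM query, `toy_decide` is a deliberately flawed toy policy, and the `min_hourly_rate` threshold of 1 Euro per hour is an assumption chosen for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM call: returns True if the assistant
# accepts `reward_eur` in exchange for waiting `wait_minutes`.
Decide = Callable[[float, float], bool]


def sanity_flags(
    decide: Decide,
    rewards: List[float],
    waits: List[float],
    min_hourly_rate: float = 1.0,  # assumed threshold, not from the paper
) -> List[Tuple[str, float, float]]:
    """Flag decisions that resemble concerns (3) and (4) in the abstract."""
    flags = []
    for reward in rewards:
        for wait in waits:
            accepted = decide(reward, wait)
            if wait == 0 and reward > 0 and not accepted:
                # Concern (4): a gain is rejected although no wait is imposed.
                flags.append(("rejected_free_gain", reward, wait))
            elif accepted and wait > 0 and reward / (wait / 60) < min_hourly_rate:
                # Concern (3): a long wait is accepted for a very low reward.
                flags.append(("accepted_low_reward", reward, wait))
    return flags


def toy_decide(reward_eur: float, wait_minutes: float) -> bool:
    # Deliberately flawed toy policy: accepts any offer that involves a wait
    # and rejects every offer with no wait at all.
    return wait_minutes > 0


if __name__ == "__main__":
    # Rewards in Euro, waits in minutes (600 minutes = 10 hours).
    for flag in sanity_flags(toy_decide, rewards=[1, 1000], waits=[0, 600]):
        print(flag)
```

In practice `decide` would wrap a prompt sent to a model, and running the same grid with rephrased prompts or different models would expose the variance and fragility described in concerns (1) and (2).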

Code Implementations: 1 repo
