AI CLJan 8

The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models

Tassallah Abdullahi, Shrestha Ghosh, Hamish S Fraser, Daniel León Tramontini, Adeel Abbasi, Ghada Bourjeily, Carsten Eickhoff, Ritambhara Singh

arXiv:2601.05376v16.02 citationsh-index: 8Has Code

Originality Incremental advance

AI Analysis

This work addresses the problem of ensuring safe and effective AI decision-making in healthcare by revealing that persona conditioning is not a monotonic safety guarantee, which is crucial for clinicians and developers but incremental in its methodological approach.

The study investigated how medical personas affect clinical language models, finding that they improve accuracy and calibration by up to ~20% in critical care tasks but degrade performance similarly in primary-care settings, with interaction styles influencing risk behavior in a model-dependent way.

Persona conditioning can be viewed as a behavioral prior for large language models (LLMs) and is often assumed to confer expertise and improve safety in a monotonic manner. However, its effects on high-stakes clinical decision-making remain poorly characterized. We systematically evaluate persona-based control in clinical LLMs, examining how professional roles (e.g., Emergency Department physician, nurse) and interaction styles (bold vs.\ cautious) influence behavior across models and medical tasks. We assess performance on clinical triage and patient-safety tasks using multidimensional evaluations that capture task accuracy, calibration, and safety-relevant risk behavior. We find systematic, context-dependent, and non-monotonic effects: Medical personas improve performance in critical care tasks, yielding gains of up to $\sim+20\%$ in accuracy and calibration, but degrade performance in primary-care settings by comparable margins. Interaction style modulates risk propensity and sensitivity, but it's highly model-dependent. While aggregated LLM-judge rankings favor medical over non-medical personas in safety-critical cases, we found that human clinicians show moderate agreement on safety compliance (average Cohen's $κ= 0.43$) but indicate a low confidence in 95.9\% of their responses on reasoning quality. Our work shows that personas function as behavioral priors that introduce context-dependent trade-offs rather than guarantees of safety or expertise. The code is available at https://github.com/rsinghlab/Persona\_Paradox.

View on arXiv PDF Code

Similar