AI, LG · Sep 27, 2025

Risk Profiling and Modulation for LLMs

arXiv:2509.23058v31 citations · h-index: 7

Originality: Incremental advance

AI Analysis

This work examines the underexplored risk behavior of LLMs in decision-making tasks under uncertainty and offers incremental insights into behavioral alignment.

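The paper studies how to elicit and modulate the risk profiles of large language models (LLMs) making decisions under uncertainty. It finds that instruction-tuned models behave consistently with some standard utility formulations, whereas pre-trained and RLHF-aligned models deviate more from every utility model fitted. Among the modulation strategies evaluated (prompt engineering, in-context learning, and post-training), post-training gives the most stable and effective control over risk preference.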

Large language models (LLMs) are increasingly used for decision-making tasks under uncertainty; however, their risk profiles and how they are influenced by prompting and alignment methods remain underexplored. Existing studies have primarily examined personality prompting or multi-agent interactions, leaving open the question of how post-training influences the risk behavior of LLMs. In this work, we propose a new pipeline for eliciting, steering, and modulating LLMs' risk profiles, drawing on tools from behavioral economics and finance. Using utility-theoretic models, we compare pre-trained, instruction-tuned, and RLHF-aligned LLMs, and find that while instruction-tuned models exhibit behaviors consistent with some standard utility formulations, pre-trained and RLHF-aligned models deviate more from any utility models fitted. We further evaluate modulation strategies, including prompt engineering, in-context learning, and post-training, and show that post-training provides the most stable and effective modulation of risk preference. Our findings provide insights into the risk profiles of different classes and stages of LLMs and demonstrate how post-training modulates these profiles, laying the groundwork for future research on behavioral alignment and risk-aware LLM design.
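To make the idea of fitting a utility-theoretic model to observed choices concrete, here is a minimal Python sketch. It is an illustration only, not the paper's pipeline: the abstract does not say which utility families or fitting procedure the authors use. The sketch assumes a CRRA (constant relative risk aversion) utility, a grid search over the risk-aversion coefficient `gamma`, and made-up choice data; the names `crra_utility`, `certainty_equivalent`, and `fit_gamma` are hypothetical.

```python
import math

# Assumption: CRRA utility is used purely for illustration; the source
# only refers to "standard utility formulations" without naming one.
def crra_utility(x, gamma):
    """CRRA utility of a positive payoff x.

    gamma > 0 is risk-averse, gamma == 0 is risk-neutral, gamma < 0 is risk-seeking.
    """
    if x <= 0:
        raise ValueError("CRRA utility requires a positive payoff")
    if abs(gamma - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def certainty_equivalent(lottery, gamma):
    """Sure amount whose utility equals the lottery's expected utility.

    lottery is a list of (probability, payoff) pairs.
    """
    eu = sum(p * crra_utility(x, gamma) for p, x in lottery)
    if abs(gamma - 1.0) < 1e-12:
        return math.exp(eu)
    return (eu * (1.0 - gamma) + 1.0) ** (1.0 / (1.0 - gamma))


def fit_gamma(choices, grid):
    """Return the gamma in grid that predicts the most observed choices.

    choices is a list of (lottery, sure_amount, chose_lottery) records.
    A model with coefficient gamma predicts "lottery" whenever the
    lottery's certainty equivalent exceeds the sure amount.
    Ties are broken in favor of the earliest grid value.
    """
    def hits(gamma):
        return sum(
            (certainty_equivalent(lottery, gamma) > sure) == chose
            for lottery, sure, chose in choices
        )
    return max(grid, key=hits)


# Hypothetical data: a 50/50 lottery over 100 and 10, offered against
# four sure amounts, with the choice recorded for each offer.
lottery = [(0.5, 100.0), (0.5, 10.0)]
observed = [
    (lottery, 20.0, True),
    (lottery, 30.0, True),
    (lottery, 40.0, False),
    (lottery, 50.0, False),
]
best_gamma = fit_gamma(observed, [0.0, 0.5, 1.0, 2.0])
```

In this toy setting, a model whose responses switch from the lottery to the sure amount at a lower sure value is assigned a larger `gamma`, that is, a more risk-averse profile. Responses that no value of `gamma` predicts well would correspond to the kind of deviation from fitted utility models that the abstract reports for pre-trained and RLHF-aligned models.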
