CLApr 18

Auditing Support Strategies in LLMs through Grounded Multi-Turn Social Simulation

arXiv:2604.1707931.11 citationsh-index: 9
AI Analysis

For researchers and developers auditing LLMs in socially sensitive applications, this work provides a method to uncover trajectory-level support dynamics that single-turn evaluations miss.

The paper introduces a multi-turn simulation framework to evaluate how LLMs provide social support, revealing that support strategies shift with estimated user distress—teaching decreases as distress rises—across two models and over 6,200 turns, a pattern invisible to single-turn evaluations.

When users seek social support from chatbots, they disclose their situation gradually, yet most evaluations of supportive LLMs rely on single-turn, fully specified prompts. We introduce a multi-turn simulation framework that closes this gap. Support-seeking narratives from five Reddit communities are decomposed into ordered fragments and revealed turn by turn to a language model. Each response is coded with the Social Support Behavior Code (SSBC), an established multi-label taxonomy that captures the composition of support, rather than a single quality score. To ask whether support choices track the model's own construal of user distress, we use linear probes on hidden representations to estimate this internal signal without altering the generation context. Across two mid-scale models (Llama-3.1-8B, OLMo-3-7B) and more than 6,200 turns, support composition shifts systematically with estimated distress: teaching declines as estimated distress rises, a finding that replicates across architectures, while increases in affective and esteem-oriented strategies (such as validation) are suggestive but model-specific and rest on noisier annotations. Community context independently shapes behavior, tracking topic and discourse norms rather than demographic categories. These trajectory-level dynamics, invisible to single-turn evaluation, motivate multi-turn auditing frameworks for socially sensitive applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes