CL · May 8

SCENE: Recognizing Social Norms and Sanctioning in Group Chats

arXiv:2605.07823 · 19.0
Predicted impact: top 32% in CL · last 90 days
Originality: incremental advance
AI Analysis

For AI safety and social AI research, this provides a dynamic evaluation of LLMs' social norm adaptation, addressing a gap in interactional testing.

The paper introduces SCENE, a benchmark for evaluating LLMs' ability to recognize and adapt to implicit social norms in group chats. Results show Claude Opus 4.7 and Gemini 3.1 Pro adapt significantly better than open-weight models.

Online group chats are social spaces with implicit behavior patterns that, when broken, are often met with social sanctioning from the group. The ability and willingness of LLM-based agents to recognize and adapt to these norms remain mostly unexplored. We introduce SCENE, a social-interaction benchmark focused on implicit norms and social sanctioning in multi-party chat. SCENE generates plausible non-roleplay scenarios with scripted personas that follow a hidden norm, create opportunities for the subject agent to violate it, and sanction breaches when they occur. We further propose behavioral evaluation metrics for two functional adaptation abilities: responsiveness to negative sanctioning, and adapting to norms from peers' behavior. We evaluate six frontier and open-weight models on SCENE. Our results show that Claude Opus 4.7 and Gemini 3.1 Pro adapt to implicit norms significantly more than the evaluated open-weight models. SCENE contributes a benchmark in response to recent calls for dynamic, interactional evaluation of LLM social capabilities.
