How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study
This paper addresses the problem of understanding and controlling emotion in AI systems. Aimed at researchers and developers, it offers a mechanistic approach rather than a surface-level treatment, though it builds incrementally on existing emotion-aware studies.
The paper studies how emotional signals shape the behavior of large language models (LLMs) and agents. It proposes E-STEER, an interpretable emotion steering framework that embeds emotion as a controllable variable in hidden states. The results show that specific emotions enhance LLM capability, improve safety, and systematically shape multi-step agent behaviors, revealing non-monotonic emotion-behavior relations consistent with psychological theories.
Emotion plays an important role in human cognition and performance. Motivated by this, we investigate whether analogous emotional signals can shape the behavior of large language models (LLMs) and agents. Existing emotion-aware studies mainly treat emotion as a surface-level style factor or a perception target, overlooking its mechanistic role in task processing. To address this limitation, we propose E-STEER, an interpretable emotion steering framework that enables direct representation-level intervention in LLMs and agents. E-STEER embeds emotion as a structured, controllable variable in hidden states, which lets us examine the impact of emotion on objective reasoning, subjective generation, safety, and multi-step agent behaviors. The results reveal non-monotonic emotion-behavior relations consistent with established psychological theories, and show that specific emotions enhance LLM capability, improve safety, and systematically shape multi-step agent behaviors.
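To make the idea of representation-level intervention concrete, the following is a minimal sketch of generic additive activation steering. It is not E-STEER's actual procedure, which the abstract does not specify. It assumes an emotion can be approximated by a single direction in hidden-state space, estimated here as the difference of mean hidden states between emotional and neutral prompts. The function names (`mean_difference_direction`, `steer`, `projection`), the toy three-dimensional vectors, and the strength parameter `alpha` are all hypothetical and chosen for illustration only.

```python
from math import sqrt
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length."""
    norm = sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("steering direction must be non-zero")
    return [x / norm for x in v]


def mean_difference_direction(emotional: List[Vector], neutral: List[Vector]) -> Vector:
    """Assumed estimator: unit vector from the neutral mean to the emotional mean."""
    dim = len(emotional[0])
    mean_e = [sum(h[i] for h in emotional) / len(emotional) for i in range(dim)]
    mean_n = [sum(h[i] for h in neutral) / len(neutral) for i in range(dim)]
    return normalize([e - n for e, n in zip(mean_e, mean_n)])


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Additive intervention: shift the hidden state by alpha along the direction."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def projection(hidden: Vector, direction: Vector) -> float:
    """Component of the hidden state along the (unit) emotion direction."""
    return sum(h * d for h, d in zip(hidden, direction))


# Toy hidden states (hypothetical values, not taken from any model).
emotional_states = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
neutral_states = [[0.0, 2.0, 0.0], [-2.0, 2.0, 0.0]]

direction = mean_difference_direction(emotional_states, neutral_states)
original = [0.5, 0.5, 0.5]
steered = steer(original, direction, alpha=2.0)
```

Because the direction has unit length, the steered state's projection onto it grows by exactly `alpha`, so `alpha` acts as the controllable intensity. Sweeping `alpha` over a range and measuring task performance at each value is one way a non-monotonic emotion-behavior relation could be probed under these assumptions.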