Emotions Where Art Thou: Understanding and Characterizing the Emotional Latent Space of Large Language Models
This work offers insight into how LLMs process emotion, which could inform applications in affective computing and human-AI interaction, although the contribution is incremental because it characterizes existing models.
The paper investigates how large language models internally represent emotion by analyzing their hidden-state geometry. It identifies a low-dimensional emotional manifold that is stable across layers and generalizes to eight datasets in five languages, with low cross-domain alignment error and strong linear probe performance.
This work investigates how large language models (LLMs) internally represent emotion by analyzing the geometry of their hidden-state space. The paper identifies a low-dimensional emotional manifold and shows that emotional representations are directionally encoded, distributed across layers, and aligned with interpretable dimensions. These structures are stable across depth and generalize to eight real-world emotion datasets spanning five languages. Cross-domain alignment yields low error and strong linear probe performance, indicating a universal emotional subspace. Within this space, internal emotion perception can be steered while preserving semantics using a learned intervention module, with especially strong control for basic emotions across languages. These findings reveal a consistent and manipulable affective geometry in LLMs and offer insight into how they internalize and process emotion.
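To make the ideas of a low-dimensional subspace, a linear probe, and steering within that subspace concrete, here is a minimal sketch on synthetic data. It is not the paper's method. The hidden states are simulated rather than taken from an LLM, the subspace is found with plain PCA, and the paper's learned intervention module is replaced by a simple shift of coordinates inside the subspace. All function names (`fit_subspace`, `fit_linear_probe`, `steer`, `run_demo`) and all sizes are hypothetical choices for illustration.

```python
import numpy as np

def fit_subspace(H, k):
    """Return the mean and top-k principal directions of hidden states H (n, d)."""
    mu = H.mean(axis=0)
    # Rows of Vt are orthonormal directions ordered by explained variance.
    _, _, Vt = np.linalg.svd(H - mu, full_matrices=False)
    return mu, Vt[:k]

def fit_linear_probe(Z, y, n_classes):
    """Least-squares linear probe from subspace coordinates Z (n, k) to one-hot labels."""
    Z1 = np.hstack([Z, np.ones((len(Z), 1))])  # append a bias column
    Y = np.eye(n_classes)[y]
    W, *_ = np.linalg.lstsq(Z1, Y, rcond=None)
    return W

def probe_predict(W, Z):
    Z1 = np.hstack([Z, np.ones((len(Z), 1))])
    return (Z1 @ W).argmax(axis=1)

def steer(h, mu, basis, target_coords, alpha=1.0):
    """Shift h toward target coordinates inside the subspace.

    The component of h orthogonal to the subspace is left unchanged, which
    stands in (very loosely) for "preserving semantics".
    """
    z = (h - mu) @ basis.T
    return h + alpha * (target_coords - z) @ basis

def run_demo():
    rng = np.random.default_rng(0)
    d, k, n_classes, n = 64, 3, 4, 400  # assumed sizes, not from the paper

    # Synthetic "emotion" clusters living in a random 3-D subspace of a 64-D space.
    true_basis, _ = np.linalg.qr(rng.normal(size=(d, k)))
    centers = 4.0 * np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]])
    y = np.arange(n) % n_classes
    H = centers[y] @ true_basis.T + 0.1 * rng.normal(size=(n, d))

    mu, basis = fit_subspace(H, k)
    Z = (H - mu) @ basis.T
    W = fit_linear_probe(Z, y, n_classes)
    accuracy = float((probe_predict(W, Z) == y).mean())

    # Steer one class-0 state toward the mean coordinates of class 1.
    h = H[0]
    target = Z[y == 1].mean(axis=0)
    h_steered = steer(h, mu, basis, target)
    z_steered = (h_steered - mu) @ basis.T
    steered_label = int(probe_predict(W, z_steered[None, :])[0])

    # Measure how much the part of h outside the subspace moved.
    def residual(v):
        return (v - mu) - ((v - mu) @ basis.T) @ basis
    drift = float(np.linalg.norm(residual(h_steered) - residual(h)))
    return accuracy, steered_label, drift

if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the probe reads the class from three coordinates, and steering changes the probe's prediction while leaving the orthogonal component of the state untouched. Whether real LLM hidden states behave this cleanly is exactly the empirical question the paper addresses.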