Language Models Predict Empathy Gaps Between Social In-groups and Out-groups
The finding reveals a potential bias in LLMs that could affect applications in social contexts, though the contribution is incremental because it extends known findings from human psychology to AI.
The study investigated whether large language models (LLMs) replicate human empathy gaps, that is, whether they predict higher emotion intensities for in-group members than for out-group members in an emotion intensity prediction task. It found this bias across race/ethnicity, nationality, and religion, with Llama-3.1-8B showing the strongest bias.
Studies of human psychology have demonstrated that people are more motivated to extend empathy to in-group members than to out-group members (Cikara et al., 2011). In this study, we investigate how LLMs replicate this aspect of human intergroup relations in an emotion intensity prediction task. In this task, the LLM is given a short description of an experience that caused a person to feel a particular emotion; the LLM is then prompted to predict the intensity of that emotion on a numerical scale. By manipulating the group identities assigned to the LLM's persona (the "perceiver") and the person in the narrative (the "experiencer"), we measure how predicted emotion intensities differ between in-group and out-group settings. We observe that LLMs assign higher emotion intensity scores to in-group members than to out-group members. This pattern holds across all three types of social groupings we tested: race/ethnicity, nationality, and religion. We perform an in-depth analysis of Llama-3.1-8B, the model that exhibited the strongest intergroup bias among those tested.
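The perceiver/experiencer manipulation described above can be illustrated with a minimal sketch. It is not the study's implementation: the prompt wording, the 1 to 100 scale, the placeholder group labels, and the names `build_prompt`, `empathy_gap`, and `toy_predict` are all assumptions introduced for illustration, and `toy_predict` is a stand-in for a real LLM call.

```python
from itertools import product
from statistics import mean
from typing import Callable

# Placeholder group labels (hypothetical; the study uses race/ethnicity,
# nationality, and religion groups).
GROUPS = ["group A", "group B", "group C"]

# Hypothetical (narrative, emotion) pairs.
NARRATIVES = [("I lost my keys on the way to an interview.", "fear")]


def build_prompt(perceiver: str, experiencer: str, narrative: str, emotion: str) -> str:
    """Assumed prompt template: persona line, narrative, then the rating request."""
    return (
        f"You are a person from {perceiver}.\n"
        f'A person from {experiencer} wrote: "{narrative}"\n'
        f"On a scale of 1 to 100, how intensely did they feel {emotion}? "
        "Answer with a number only."
    )


def empathy_gap(
    predict: Callable[[str], float],
    narratives: list[tuple[str, str]],
    groups: list[str],
) -> float:
    """Mean in-group intensity minus mean out-group intensity."""
    in_scores, out_scores = [], []
    # Cross every narrative with every (perceiver, experiencer) pair.
    for (narrative, emotion), perceiver, experiencer in product(narratives, groups, groups):
        score = predict(build_prompt(perceiver, experiencer, narrative, emotion))
        (in_scores if perceiver == experiencer else out_scores).append(score)
    return mean(in_scores) - mean(out_scores)


def toy_predict(prompt: str) -> float:
    """Stand-in for an LLM: 70 when perceiver and experiencer share a group, else 60."""
    same_group = any(prompt.count(g) == 2 for g in GROUPS)
    return 70.0 if same_group else 60.0


gap = empathy_gap(toy_predict, NARRATIVES, GROUPS)
print(gap)
```

A positive gap corresponds to the pattern reported here, with higher predicted intensities for in-group experiencers. Replacing `toy_predict` with a function that queries a model and parses its numeric answer would turn the sketch into an actual measurement.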