CL · AI · Sep 20, 2024

EmotionQueen: A Benchmark for Evaluating Empathy of Large Language Models

arXiv:2409.13359v1 · 34 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the need for better emotional intelligence assessment in LLMs for natural language processing applications, though it is incremental as it builds on existing sentiment analysis research.

The paper tackles the problem of evaluating emotional intelligence in large language models by introducing EmotionQueen, a benchmark with four tasks and two metrics, which reveals significant capabilities and limitations in LLMs' empathy.

Emotional intelligence in large language models (LLMs) is of great importance in Natural Language Processing. However, previous research has mainly focused on basic sentiment analysis tasks, such as emotion recognition, which are not enough to evaluate LLMs' overall emotional intelligence. Therefore, this paper presents a novel framework named EmotionQueen for evaluating the emotional intelligence of LLMs. The framework includes four distinct tasks: Key Event Recognition, Mixed Event Recognition, Implicit Emotional Recognition, and Intention Recognition. LLMs are asked to recognize important events or implicit emotions and to generate empathetic responses. We also design two metrics to evaluate LLMs' capabilities in recognition and response for emotion-related statements. Experiments yield significant conclusions about LLMs' capabilities and limitations in emotional intelligence.

