CL Jun 16, 2024

Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?

arXiv:2406.11065v215.228 citations Has Code

Originality Incremental advance

AI Analysis

The paper addresses a key limitation of LLMs in dialogue understanding. Its contributions, a new benchmark and an evaluation method, are incremental.

The paper asks whether Large Language Models (LLMs) can understand emphasis in dialogue, which signals speaker intention beyond the text itself. It introduces the Emphasized-Talk benchmark and evaluates a range of models, finding that commercial LLMs perform better but still have significant room for improvement.

Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue. While Large Language Models (LLMs) have revolutionized natural language processing, their ability to understand emphasis in dialogue remains unclear. This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis. We evaluate various LLMs, both open-source and commercial, to measure their performance in understanding emphasis. Additionally, we propose an automatic evaluation pipeline using GPT-4, which achieves a high correlation with human rating. Our findings reveal that although commercial LLMs generally perform better, there is still significant room for improvement in comprehending emphasized sentences.
