EMODIS: A Benchmark for Context-Dependent Emoji Disambiguation in Large Language Models
This work addresses context-dependent ambiguity in LLMs used for real-world communication; its contribution is incremental, since it introduces a new benchmark rather than a novel method.
The authors evaluate how well large language models interpret ambiguous emoji expressions in context, and find that even top models frequently fail to distinguish meanings when only subtle contextual cues are present, revealing systematic biases and limited pragmatic sensitivity.
Large language models (LLMs) are increasingly deployed in real-world communication settings, yet their ability to resolve context-dependent ambiguity remains underexplored. In this work, we present EMODIS, a new benchmark for evaluating LLMs' capacity to interpret ambiguous emoji expressions under minimal but contrastive textual contexts. Each instance in EMODIS comprises an ambiguous sentence containing an emoji, two distinct disambiguating contexts that lead to divergent interpretations, and a specific question that requires contextual reasoning. We evaluate both open-source and API-based LLMs, and find that even the strongest models frequently fail to distinguish meanings when only subtle contextual cues are present. Further analysis reveals systematic biases toward dominant interpretations and limited sensitivity to pragmatic contrast. EMODIS provides a rigorous testbed for assessing contextual disambiguation, and highlights the gap in semantic reasoning between humans and LLMs.
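To make the instance structure concrete, the following is a minimal sketch of how one EMODIS-style item and a paired evaluation might be represented. It is an illustration under stated assumptions, not the benchmark's released format or scoring code: the field names, the `build_prompt` template, the exact-match scoring, and the `collapsed` flag are all hypothetical, and the example item is invented for illustration rather than taken from the dataset.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class EmojiInstance:
    """One benchmark item (field names are assumptions, not the official schema)."""
    sentence: str   # ambiguous sentence containing an emoji
    context_a: str  # first disambiguating context
    context_b: str  # second, contrastive disambiguating context
    question: str   # question that requires contextual reasoning
    answer_a: str   # assumed gold interpretation under context_a
    answer_b: str   # assumed gold interpretation under context_b


def build_prompt(inst: EmojiInstance, context: str) -> str:
    # Hypothetical prompt template; the source does not specify one.
    return f"Context: {context}\nMessage: {inst.sentence}\nQuestion: {inst.question}"


def _matches(prediction: str, gold: str) -> bool:
    # Assumption: case-insensitive exact match stands in for the real metric.
    return prediction.strip().lower() == gold.strip().lower()


def score_instance(inst: EmojiInstance, model: Callable[[str], str]) -> Dict[str, bool]:
    """Query the model once per context and compare against each gold answer."""
    pred_a = model(build_prompt(inst, inst.context_a))
    pred_b = model(build_prompt(inst, inst.context_b))
    correct_a = _matches(pred_a, inst.answer_a)
    correct_b = _matches(pred_b, inst.answer_b)
    return {
        "correct_a": correct_a,
        "correct_b": correct_b,
        "both_correct": correct_a and correct_b,
        # Same answer under both contexts: the model ignored the contrast.
        "collapsed": _matches(pred_a, pred_b),
    }


# Invented example item, for illustration only (not from the dataset).
EXAMPLE = EmojiInstance(
    sentence="Great job on the presentation 🙂",
    context_a="My colleague rehearsed for weeks and the client loved it.",
    context_b="My colleague forgot half the slides and the client walked out.",
    question="Is the speaker sincere or sarcastic?",
    answer_a="sincere",
    answer_b="sarcastic",
)


def always_sincere(prompt: str) -> str:
    # Stand-in for a model biased toward one dominant interpretation.
    return "sincere"


def keyword_model(prompt: str) -> str:
    # Stand-in for a context-sensitive model (a toy keyword rule).
    return "sarcastic" if "forgot" in prompt else "sincere"


if __name__ == "__main__":
    print(score_instance(EXAMPLE, always_sincere))
    print(score_instance(EXAMPLE, keyword_model))
```

Scoring the two contexts jointly, rather than each in isolation, is one way to expose the bias toward dominant interpretations described above: a model that always returns the more common reading is right on one context of each pair but never gets both, and its answers are flagged as collapsed.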