CL · AI · Sep 17, 2025

Correct-Detect: Balancing Performance and Ambiguity Through the Lens of Coreference Resolution in LLMs

arXiv:2509.14456v22 citations · h-index: 2 · EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses a foundational linguistic challenge for LLM applications, but the advance is incremental: it builds on known capabilities and does not resolve the trade-off it identifies.

The paper examines how LLMs balance coreference resolution against ambiguity detection. It shows that models perform well on each task individually but cannot do both simultaneously, a trade-off the authors term CORRECT-DETECT.

Large Language Models (LLMs) are intended to reflect human linguistic competencies. But humans have access to a broad and embodied context, which is key in detecting and resolving linguistic ambiguities, even in isolated text spans. A foundational case of semantic ambiguity is found in the task of coreference resolution: how is a pronoun related to an earlier person mention? This capability is implicit in nearly every downstream task, and the presence of ambiguity at this level can alter performance significantly. We show that LLMs can achieve good performance with minimal prompting in both coreference disambiguation and the detection of ambiguity in coreference; however, they cannot do both at the same time. We present the CORRECT-DETECT trade-off: though models have both capabilities and deploy them implicitly, successful performance balancing these two abilities remains elusive.

