CL, Feb 19, 2024

Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST!

arXiv:2402.12486v2, 30 citations, h-index: 9, ACL
Originality: Incremental advance
AI Analysis

This study probes a limitation in how current language models process sentence semantics, using a novel dataset (DUST) to evaluate how they handle underspecified sentences of the kind found in everyday communication.

The authors ask whether pre-trained language models can detect and understand semantically underspecified sentences. They find that newer models identify such sentences reasonably well when explicitly prompted, but that interpreting them correctly is much harder. Models also show little uncertainty when interpreting them, contrary to what theoretical accounts of underspecification predict.

In everyday language use, speakers frequently utter and interpret sentences that are semantically underspecified, namely, whose content is insufficient to fully convey their message or interpret them univocally. For example, to interpret the underspecified sentence "Don't spend too much", which leaves implicit what (not) to spend, additional linguistic context or outside knowledge is needed. In this work, we propose a novel Dataset of semantically Underspecified Sentences grouped by Type (DUST) and use it to study whether pre-trained language models (LMs) correctly identify and interpret underspecified sentences. We find that newer LMs are reasonably able to identify underspecified sentences when explicitly prompted. However, interpreting them correctly is much harder for any LMs. Our experiments show that when interpreting underspecified sentences, LMs exhibit little uncertainty, contrary to what theoretical accounts of underspecification would predict. Overall, our study reveals limitations in current models' processing of sentence semantics and highlights the importance of using naturalistic data and communicative scenarios when evaluating LMs' language capabilities.

Code Implementations: 1 repo
