CL · Oct 10, 2025

WUGNECTIVES: Novel Entity Inferences of Language Models from Discourse Connectives

Daniel Brubaker, William Sheffield, Junyi Jessy Li, Kanishka Misra

arXiv:2510.09556v24.91 citations · h-index: 3 · Has Code

Originality: Incremental advance

AI Analysis

This work examines whether discourse cues can inform language models' inferences about the world. The advance is incremental because it builds on existing research on discourse relations.

The authors ask whether discourse connectives can inform language models about world knowledge. To test this, they created WUGNECTIVES, a dataset of 8,880 stimuli that evaluates inferences about novel entities. Tuning models to show reasoning behavior improved performance on most connectives, but all models struggled with concessive connectives.

World knowledge has been particularly crucial for predicting the discourse connective that marks the discourse relation between two arguments, and language models (LMs) are generally successful at this task. We flip this premise and instead study the inverse problem: whether discourse connectives can inform LMs about the world. To this end, we present WUGNECTIVES, a dataset of 8,880 stimuli that evaluates LMs' inferences about novel entities in contexts where connectives link the entities to particular attributes. On investigating 17 different LMs at various scales and training regimens, we found that tuning an LM to show reasoning behavior yields noteworthy improvements on most connectives. At the same time, LMs' overall performance varied widely across connective types, with all models systematically struggling on connectives that express a concessive meaning. Our findings pave the way for more nuanced investigations into the functional role of language cues as captured by LMs. We release WUGNECTIVES at https://github.com/sheffwb/wugnectives.
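The abstract reports performance broken down by connective type. A minimal sketch of that kind of breakdown is shown here, under loud assumptions: the `Stimulus` fields, the two-way forced choice, the example sentences, and the `toy_score` function are all hypothetical illustrations and are not drawn from the WUGNECTIVES dataset or the authors' code. In practice `score` would be replaced by a real LM scoring function, such as the log-probability of the continuation given the context.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Stimulus(NamedTuple):
    """Hypothetical stimulus layout; not the actual WUGNECTIVES schema."""
    connective_type: str  # illustrative label, e.g. "causal" or "concessive"
    context: str          # sentence linking a novel entity to an attribute
    expected: str         # inference consistent with the connective
    alternative: str      # competing inference


def accuracy_by_connective(
    stimuli: Iterable[Stimulus],
    score: Callable[[str, str], float],
) -> Dict[str, float]:
    """Fraction of stimuli, per connective type, where the scorer
    prefers the expected inference over the alternative."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for s in stimuli:
        totals[s.connective_type] += 1
        if score(s.context, s.expected) > score(s.context, s.alternative):
            hits[s.connective_type] += 1
    return {k: hits[k] / totals[k] for k in totals}


def toy_score(context: str, continuation: str) -> float:
    """Stand-in scorer (assumption): ignores the connective and simply
    rewards continuations that already appear in the context."""
    return float(continuation in context)


# Made-up examples asking what wugs are typically like.
TOY_STIMULI = [
    Stimulus("causal", "Dax is a wug, so Dax is friendly.",
             expected="friendly", alternative="unfriendly"),
    Stimulus("concessive", "Dax is a wug, but Dax is friendly.",
             expected="unfriendly", alternative="friendly"),
]

if __name__ == "__main__":
    print(accuracy_by_connective(TOY_STIMULI, toy_score))
```

Because the stand-in scorer never looks at the connective, it picks the same answer for both toy items. This only shows why a per-connective breakdown is more informative than a single aggregate score; it makes no claim about why the models in the paper struggle.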
