CL · Sep 27, 2021

Pragmatic competence of pre-trained language models through the lens of discourse connectives

arXiv:2109.12951v1662 citations
Originality: Synthesis-oriented
AI Analysis

This work helps researchers and practitioners understand what pre-trained language models can and cannot do, offering incremental insight into their pragmatic limitations.

The paper investigates the pragmatic competence of pre-trained language models, focusing on discourse connectives. It finds that models perform reasonably well on naturally-occurring data but show low sensitivity to controlled pragmatic cues and lack humanlike temporal preferences, which suggests that current pre-training paradigms yield limited pragmatic competence.

As pre-trained language models (LMs) continue to dominate NLP, it is increasingly important that we understand the depth of language capabilities in these models. In this paper, we target pre-trained LMs' competence in pragmatics, with a focus on pragmatics relating to discourse connectives. We formulate cloze-style tests using a combination of naturally-occurring data and controlled inputs drawn from psycholinguistics. We focus on testing models' ability to use pragmatic cues to predict discourse connectives, models' ability to understand implicatures relating to connectives, and the extent to which models show humanlike preferences regarding temporal dynamics of connectives. We find that although models predict connectives reasonably well in the context of naturally-occurring data, when we control contexts to isolate high-level pragmatic cues, model sensitivity is much lower. Models also do not show substantial humanlike temporal preferences. Overall, the findings suggest that at present, dominant pre-training paradigms do not result in substantial pragmatic competence in our models.
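To make the cloze-style setup concrete, the following is a minimal sketch, not the authors' code. It fills a connective slot with each candidate, scores the resulting sentence, and checks whether the top-ranked candidate matches the gold connective. The `[MASK]` placeholder, the function names, the two example items, and `toy_score` are all assumptions made for illustration. In particular, `toy_score` is a hand-written stand-in, so the accuracy it yields says nothing about any real model.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"  # hypothetical slot marker for the connective position

# A cloze item pairs a context containing one MASK slot with the gold connective.
ClozeItem = Tuple[str, str]
Scorer = Callable[[str], float]


def rank_connectives(context: str, candidates: List[str], score: Scorer) -> List[str]:
    """Return candidates sorted from highest to lowest score when filled into the slot."""
    scores = {c: score(context.replace(MASK, c)) for c in candidates}
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


def cloze_accuracy(items: List[ClozeItem], candidates: List[str], score: Scorer) -> float:
    """Fraction of items whose top-ranked candidate equals the gold connective."""
    hits = sum(rank_connectives(ctx, candidates, score)[0] == gold for ctx, gold in items)
    return hits / len(items)


def toy_score(sentence: str) -> float:
    """Stand-in scorer (NOT a language model): rewards hand-picked connective/cue pairs.

    In a real experiment this would be replaced by a model-derived score,
    for example the probability a pre-trained LM assigns to the filled-in token.
    """
    words = [w.strip(".,").lower() for w in sentence.split()]
    cues = {"so": "cancelled", "but": "failed"}  # illustrative pairs only
    return float(sum(1 for conn, cue in cues.items() if conn in words and cue in words))


# Illustrative items written for this sketch, not taken from the paper's data.
ITEMS: List[ClozeItem] = [
    (f"It rained all day, {MASK} the match was cancelled.", "so"),
    (f"She studied hard, {MASK} she failed the exam.", "but"),
]
CANDIDATES = ["so", "but", "because"]

if __name__ == "__main__":
    print(cloze_accuracy(ITEMS, CANDIDATES, toy_score))
```

Passing the scorer in as a function keeps the evaluation loop independent of any particular model, so the same items can be reused with naturally-occurring contexts or with controlled contexts that isolate a single pragmatic cue.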

