CL · Jun 15, 2023

DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning

arXiv:2306.09030v2 · 19 citations · h-index: 9
Originality: Incremental advance
AI Analysis

The paper addresses the problem of developing communicative social agents by providing a cohesive framework for general pragmatic understanding. The advance is incremental, as it builds on prior work on figurative expressions.

The authors introduced DiPlomat, a dataset of 4,177 multi-turn dialogues to benchmark machines on pragmatic reasoning and situated conversational understanding, finding that large language models perform poorly in this subjective domain.

Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations and is essential for the development of communicative social agents. In this paper, we introduce a novel challenge, DiPlomat, aimed at benchmarking machines' capabilities in pragmatic reasoning and situated conversational understanding. Compared with previous works that treat different figurative expressions (e.g., metaphor, sarcasm) as individual tasks, DiPlomat provides a cohesive framework for general pragmatic understanding. Our dataset was created using Amazon Mechanical Turk (AMT), resulting in a total of 4,177 multi-turn dialogues. In conjunction with the dataset, we propose two tasks, Pragmatic Identification and Reasoning (PIR) and Conversational Question Answering (CQA). Experimental results with state-of-the-art (SOTA) neural architectures reveal several significant findings: 1) large language models (LLMs) exhibit poor performance in tackling this subjective domain; 2) comprehensive understanding of context emerges as a critical factor for establishing benign human-machine interactions; 3) current models are deficient in applying pragmatic reasoning. As a result, we call for more attention to improving context understanding, reasoning, and implied-meaning modeling.

