It Couldn't Help But Overhear: On the Limits of Modelling Meta-Communicative Grounding Acts with Supervised Learning
The paper highlights a fundamental limitation of current NLP dialogue models that matters to researchers and practitioners alike: it questions whether overhearing data is sufficient for building conversational AI, which makes it a foundational critique rather than an incremental improvement.
The paper tackles the problem of modelling human meta-communicative grounding acts in NLP dialogue models under the overhearing paradigm. It presents evidence that data-driven learning models may be unable to capture these processes properly, and it provides a preliminary analysis of the variability of human clarification requests.
Active participation in a conversation is key to building common ground, since understanding is jointly tailored by producers and recipients. Overhearers are deprived of the privilege of performing grounding acts and can only conjecture about intended meanings. Still, data generation and annotation, modelling, training and evaluation of NLP dialogue models rely on the overhearing paradigm. How much of the underlying grounding processes is thereby forfeited? As we show, there is evidence pointing to the impossibility of properly modelling human meta-communicative acts with data-driven learning models. In this paper, we discuss this issue and provide a preliminary analysis of the variability of human decisions for requesting clarification. Most importantly, we wish to bring this topic back to the community's table, encouraging discussion on the consequences of having models designed to only "listen in".