CL | AI | LG | Feb 5, 2024

Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models

Anthony Sicilia, Hyunwoo Kim, Khyathi Raghavi Chandu, Malihe Alikhani, Jack Hessel

AI2, NVIDIA

arXiv:2402.03284v1 | 14.627 citations | h-index: 31 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work addresses how to represent inherent uncertainty in dialogue, with applications such as negotiation systems. The advance is incremental: it extends the existing conversation forecasting task with new metrics and fine-tuning methods.

The paper tackles conversation forecasting under uncertainty by extending the task with uncertainty-aware metrics, which enable abstention on uncertain instances. On eight negotiation corpora, the proposed fine-tuning strategies calibrate smaller open-source models well enough to compete with pre-trained models 10 times their size.

Effective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a dialogue. How well can language models represent inherent uncertainty in conversations? We propose FortUne Dial, an expansion of the long-standing "conversation forecasting" task: instead of just accuracy, evaluation is conducted with uncertainty-aware metrics, effectively enabling abstention on individual instances. We study two ways in which language models potentially represent outcome uncertainty (internally, using scores and directly, using tokens) and propose fine-tuning strategies to improve calibration of both representations. Experiments on eight difficult negotiation corpora demonstrate that our proposed fine-tuning strategies (a traditional supervision strategy and an off-policy reinforcement learning strategy) can calibrate smaller open-source models to compete with pre-trained models 10x their size.
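To make the idea of uncertainty-aware evaluation with abstention concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's evaluation code: the abstract does not name its metrics, so the Brier score is used here as one common calibration-sensitive metric, and the function names, the `margin` threshold, and the toy forecasts are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better. Assumed here as an example of an uncertainty-aware
    metric; the abstract does not say which metrics the paper uses.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def predict_or_abstain(prob: float, margin: float = 0.2) -> Optional[int]:
    """Return a hard 0/1 prediction, or None (abstain) when the forecast
    is closer than `margin` to 0.5. `margin` is a hypothetical knob."""
    if abs(prob - 0.5) < margin:
        return None
    return int(prob > 0.5)


def selective_accuracy(
    probs: Sequence[float], outcomes: Sequence[int], margin: float = 0.2
) -> Tuple[Optional[float], float]:
    """Accuracy on the instances that were answered, plus coverage
    (the fraction of instances answered). Accuracy is None if the
    forecaster abstained on everything."""
    answered = [
        (decision, y)
        for decision, y in ((predict_or_abstain(p, margin), y)
                            for p, y in zip(probs, outcomes))
        if decision is not None
    ]
    coverage = len(answered) / len(probs)
    if not answered:
        return None, coverage
    accuracy = sum(d == y for d, y in answered) / len(answered)
    return accuracy, coverage


# Toy forecasts of "deal" (1) vs "no deal" (0) for four conversations.
toy_probs = [0.9, 0.6, 0.2, 0.45]
toy_outcomes = [1, 0, 0, 1]

toy_brier = brier_score(toy_probs, toy_outcomes)
toy_accuracy, toy_coverage = selective_accuracy(toy_probs, toy_outcomes)
```

In this toy example the forecaster abstains on the two forecasts nearest 0.5, so accuracy is computed only on the two confident forecasts. A well-calibrated model should gain accuracy as coverage is reduced, which plain accuracy over all instances cannot show.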
