CL | LG | Sep 18, 2024

Finetuning Language Models to Emit Linguistic Expressions of Uncertainty

arXiv:2409.12180v1 | 21 citations | h-index: 14
Originality: Incremental advance
AI Analysis

The paper addresses user trust in, and the reliability of, LLM outputs for information-seeking and decision-making tasks. The advance is incremental because it builds on existing finetuning methods.

The paper tackles the problem of large language models (LLMs) presenting inaccurate information confidently, which misleads users. Its approach is supervised finetuning that teaches LLMs to produce linguistic expressions of uncertainty. The results show that pre-trained LLMs are well-calibrated in assessing their own predictions, and that finetuning on the model's own confidence yields well-calibrated uncertainty expressions, especially for single-claim answers.

Large language models (LLMs) are increasingly employed in information-seeking and decision-making tasks. Despite their broad utility, LLMs tend to generate information that conflicts with real-world facts, and their persuasive style can make these inaccuracies appear confident and convincing. As a result, end-users struggle to consistently align the confidence expressed by LLMs with the accuracy of their predictions, often leading to either blind trust in all outputs or a complete disregard for their reliability. In this work, we explore supervised finetuning on uncertainty-augmented predictions as a method to develop models that produce linguistic expressions of uncertainty. Specifically, we measure the calibration of pre-trained models and then fine-tune language models to generate calibrated linguistic expressions of uncertainty. Through experiments on various question-answering datasets, we demonstrate that LLMs are well-calibrated in assessing their predictions, and supervised finetuning based on the model's own confidence leads to well-calibrated expressions of uncertainty, particularly for single-claim answers.
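The two steps named in the abstract, measuring calibration and then attaching a linguistic expression of uncertainty to each prediction, can be illustrated with a minimal sketch. This is not the authors' implementation. The equal-width binned expected calibration error, the phrase table, its thresholds, and the function names are all assumptions chosen for illustration. The sketch also assumes that a per-answer confidence in [0, 1] is already available.

```python
from typing import List, Sequence, Tuple

# Hypothetical phrase table: (lower confidence bound, phrase).
# These thresholds and wordings are illustrative assumptions, not taken from the paper.
PHRASES: List[Tuple[float, str]] = [
    (0.9, "I am highly confident that"),
    (0.7, "It is likely that"),
    (0.4, "It is possible that"),
    (0.0, "I am unsure, but my best guess is that"),
]


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Equal-width binned ECE: bin-size-weighted mean of |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


def verbalize(confidence: float) -> str:
    """Return the first phrase whose lower bound the confidence meets."""
    for lower, phrase in PHRASES:
        if confidence >= lower:
            return phrase
    return PHRASES[-1][1]


def augment_answer(answer: str, confidence: float) -> str:
    """Build an uncertainty-augmented target string for supervised finetuning."""
    return f"{verbalize(confidence)} {answer}"
```

For example, `augment_answer("Paris is the capital of France.", 0.75)` yields a target beginning with "It is likely that". Pairs of questions and such targets would form the finetuning set. A low ECE on held-out data suggests the underlying confidences are reliable enough to be verbalized this way.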

