AI | CL | Jul 11, 2024

On the attribution of confidence to large language models

arXiv:2407.08388v1 | 8 citations | h-index: 5
Originality: Synthesis-oriented
AI Analysis

This paper addresses a foundational issue in AI evaluation that matters to both researchers and philosophers, but its contribution is incremental: it builds on existing debates without introducing new empirical data or methods.

The paper examines the theoretical basis for attributing degrees of confidence (credences) to large language models. It argues that such attributions should be read literally and that the existence of LLM credences is plausible, but that the attributions remain open to sceptical doubt because the experimental methods used to measure them may not be reliable.

Credences are mental states corresponding to degrees of confidence in propositions. Attribution of credences to Large Language Models (LLMs) is commonplace in the empirical literature on LLM evaluation. Yet the theoretical basis for LLM credence attribution is unclear. We defend three claims. First, our semantic claim is that LLM credence attributions are (at least in general) correctly interpreted literally, as expressing truth-apt beliefs on the part of scientists that purport to describe facts about LLM credences. Second, our metaphysical claim is that the existence of LLM credences is at least plausible, although current evidence is inconclusive. Third, our epistemic claim is that LLM credence attributions made in the empirical literature on LLM evaluation are subject to non-trivial sceptical concerns. It is a distinct possibility that even if LLMs have credences, LLM credence attributions are generally false because the experimental techniques used to assess LLM credences are not truth-tracking.

