On Fact and Frequency: LLM Responses to Misinformation Expressed with Uncertainty
This work addresses the reliability of LLMs in handling nuanced misinformation, which matters for AI safety and fact-checking applications, but it is incremental in that it builds on existing uncertainty typologies.
The study investigated how LLMs judge misinformation when it is expressed with uncertainty, finding that they changed their fact-checking classification from false to not-false in 25% of cases after transformation, with a small but significant correlation between fact judgments and frequency estimates.
We study LLM judgments of misinformation expressed with uncertainty. Our experiments study the response of three widely used LLMs (GPT-4o, LlaMA3, DeepSeek-v2) to misinformation propositions that have been verified as false and are then transformed into uncertain statements according to an uncertainty typology. Our results show that after transformation, LLMs change their fact-checking classification from false to not-false in 25% of the cases. Analysis reveals that the change cannot be explained by predictors to which humans are expected to be sensitive, i.e., modality, linguistic cues, or argumentation strategy. The exception is doxastic transformations, which use linguistic cue phrases such as "It is believed ...". To gain further insight, we prompt the LLM to make another judgment about the transformed misinformation statements that is not related to truth value. Specifically, we study LLM estimates of the frequency with which people make the uncertain statement. We find a small but significant correlation between judgment of fact and estimation of frequency.
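The two quantities reported above, the share of false-to-not-false changes and the correlation between fact judgment and frequency estimate, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `flip_rate` and `pearson`, the label strings, the 0-to-1 frequency scale, and the four toy items are hypothetical and are not the study's data, code, or statistical procedure. Pearson correlation with a binary variable is used here as one plausible choice; the abstract does not say which correlation measure was applied.

```python
from math import sqrt


def flip_rate(before, after):
    """Share of items labelled "false" before transformation
    that receive any other label after transformation."""
    pairs = [(b, a) for b, a in zip(before, after) if b == "false"]
    if not pairs:
        raise ValueError("no items were labelled 'false' before transformation")
    return sum(a != "false" for _, a in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Toy, made-up judgments for four statements (NOT data from the study).
before = ["false", "false", "false", "false"]      # labels for the original claims
after = ["false", "not-false", "false", "false"]   # labels after transformation
frequency = [0.1, 0.6, 0.2, 0.3]                   # hypothetical frequency estimates

# Encode the post-transformation fact judgment as 0 (false) or 1 (not-false).
not_false = [int(label != "false") for label in after]

rate = flip_rate(before, after)
r = pearson(not_false, frequency)
print(f"flip rate: {rate:.2f}, correlation: {r:.2f}")
```

With real data, a significance test would accompany the coefficient, since the abstract describes the correlation as small but significant; that step is omitted from this sketch.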