NCAI | Oct 16, 2023

Use of probabilistic phrases in a coordination game: human versus GPT-4

arXiv:2310.10544v3 | h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work evaluates how well AI can mimic human communication of uncertainty. The contribution is incremental: it applies an existing method to a new comparison.

The study compared human and GPT-4 performance in estimating probabilities and ambiguity of probabilistic phrases in coordination games across investment and medical contexts, finding high agreement in probability estimates (proportions of variance accounted for close to .90) but poorer agreement in ambiguity estimates.

English speakers use probabilistic phrases such as "likely" to communicate information about the probability or likelihood of events. Communication is successful to the extent that the listener grasps what the speaker means to convey and, if communication is successful, individuals can potentially coordinate their actions based on shared knowledge about uncertainty. We first assessed human ability to estimate the probability and the ambiguity (imprecision) of twenty-three probabilistic phrases in a coordination game in two different contexts, investment advice and medical advice. We then had GPT-4 (OpenAI), a large language model, complete the same tasks as the human participants. We found that the median human participant and GPT-4 assigned probability estimates that were in good agreement (proportions of variance accounted for close to .90). GPT-4's estimates of probability in both the investment and medical contexts were at least as close to the human participants' estimates as the human participants' estimates were to one another. Estimates of probability for both the human participants and GPT-4 were little affected by context. In contrast, human and GPT-4 estimates of ambiguity were not in such good agreement.
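
As a rough illustration of the agreement measure mentioned above, the sketch below computes a proportion of variance accounted for as the squared Pearson correlation between two sets of per-phrase probability estimates. This is an assumption about how such a figure could be computed, not necessarily the paper's exact procedure. The function name `variance_accounted_for` is hypothetical, and the numbers are made-up placeholders, not data from the study.

```python
from statistics import mean


def variance_accounted_for(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: "proportion of variance accounted for" is read here as r^2;
    the paper may define it differently.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(x), mean(y)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each sequence must vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical probability estimates (in percent) for five phrases.
# These values are invented for illustration only.
human_median = [5.0, 25.0, 50.0, 75.0, 95.0]
model = [10.0, 20.0, 50.0, 80.0, 90.0]

r2 = variance_accounted_for(human_median, model)
print(f"proportion of variance accounted for: {r2:.3f}")
```

A value near 1 means one set of estimates is close to a linear function of the other. It does not by itself show that the two sets are numerically equal, since a constant offset or rescaling leaves the squared correlation unchanged.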
