CL | Jun 4, 2024

Eliciting the Priors of Large Language Models using Iterated In-Context Learning

arXiv:2406.01860v1 | 18 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need to interpret LLM decision-making for deployment in real-world settings, though it is incremental in that it applies an existing method (iterated learning) to new models.

The authors address the problem of understanding the implicit knowledge in Large Language Models by developing a prompt-based workflow that elicits Bayesian prior distributions through iterated in-context learning. They find that priors elicited from GPT-4 qualitatively align with human priors in tasks such as causal learning and proportion estimation.

As Large Language Models (LLMs) are increasingly deployed in real-world settings, understanding the knowledge they implicitly use when making decisions is critical. One way to capture this knowledge is in the form of Bayesian prior distributions. We develop a prompt-based workflow for eliciting prior distributions from LLMs. Our approach is based on iterated learning, a Markov chain Monte Carlo method in which successive inferences are chained in a way that supports sampling from the prior distribution. We validated our method in settings where iterated learning has previously been used to estimate the priors of human participants -- causal learning, proportion estimation, and predicting everyday quantities. We found that priors elicited from GPT-4 qualitatively align with human priors in these settings. We then used the same method to elicit priors from GPT-4 for a variety of speculative events, such as the timing of the development of superhuman AI.
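To illustrate why chaining successive inferences can sample from a prior, here is a minimal simulation sketch. It is not the authors' workflow: a simple Beta-Binomial Bayesian learner stands in for the LLM, and the function name `iterated_learning_chain` and all parameter values are assumptions chosen for illustration. Each "generation" sees data produced from the previous generation's hypothesis, samples a new hypothesis from its posterior, and passes fresh data on. Under these assumptions the chain is a Gibbs sampler over hypotheses and data, so the hypotheses should settle into the learner's prior regardless of the starting point.

```python
import random
import statistics


def iterated_learning_chain(alpha, beta, n_obs, n_iters, theta0=0.99, seed=0):
    """Simulate iterated learning with a Beta(alpha, beta) prior (illustrative).

    theta is the hypothesis (a coin bias). Each iteration:
      1. data are generated from the current theta (n_obs coin flips),
      2. the next learner samples theta from its posterior given those data.
    """
    rng = random.Random(seed)
    theta = theta0  # deliberately far from the prior mean
    samples = []
    for _ in range(n_iters):
        # Step 1: generate data from the current hypothesis.
        heads = sum(rng.random() < theta for _ in range(n_obs))
        # Step 2: sample a new hypothesis from the conjugate Beta posterior.
        theta = rng.betavariate(alpha + heads, beta + n_obs - heads)
        samples.append(theta)
    return samples


# Hypothetical settings: a Beta(2, 5) prior and 5 observations per generation.
samples = iterated_learning_chain(alpha=2.0, beta=5.0, n_obs=5, n_iters=20000)
burned = samples[1000:]  # discard early iterations influenced by theta0

chain_mean = statistics.fmean(burned)
prior_mean = 2.0 / (2.0 + 5.0)
print(f"chain mean: {chain_mean:.3f}, prior mean: {prior_mean:.3f}")
```

In the prompt-based setting described above, the posterior-sampling step would be replaced by the LLM's in-context response to the previous iteration's data. The simulation only shows the convergence property the method relies on.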

