LLM-Prior: A Framework for Knowledge-Driven Prior Elicitation and Aggregation
This work addresses the problem of inefficient prior elicitation for Bayesian modelers, offering a novel automated approach that could lower barriers to sophisticated modeling.
The paper tackles the bottleneck of manual and subjective prior specification in Bayesian inference by proposing LLM-Prior, a framework that uses Large Language Models to automate the translation of unstructured contexts into valid probability distributions, achieving scalable prior elicitation and aggregation in multi-agent systems.
The specification of prior distributions is fundamental in Bayesian inference, yet it remains a significant bottleneck. The prior elicitation process is often a manual, subjective, and unscalable task. We propose a novel framework that leverages Large Language Models (LLMs) to automate and scale this process. We introduce \texttt{LLMPrior}, a principled operator that translates rich, unstructured contexts, such as natural language descriptions, data, or figures, into valid, tractable probability distributions. We formalize this operator by architecturally coupling an LLM with an explicit, tractable generative model, such as a Gaussian Mixture Model (forming an LLM-based Mixture Density Network), ensuring that the resulting prior satisfies essential mathematical properties. We further extend this framework to multi-agent systems, where Logarithmic Opinion Pooling is employed to aggregate prior distributions induced by decentralized knowledge. We present the federated prior aggregation algorithm, \texttt{Fed-LLMPrior}, for aggregating distributed, context-dependent priors in a manner robust to agent heterogeneity. This work provides the foundation for a new class of tools that can potentially lower the barrier to entry for sophisticated Bayesian modeling.
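To make the aggregation step concrete, the following is a minimal sketch of logarithmic opinion pooling in the simplest setting, not the paper's \texttt{Fed-LLMPrior} algorithm. It assumes each agent's prior is a one-dimensional Gaussian (the abstract mentions Gaussian mixtures, which generally have no such closed form) and that the pooling weights are non-negative and sum to one. The pooled density is proportional to the product of the agent densities raised to their weights, which for Gaussians is again Gaussian with a weighted sum of precisions. The function name `log_pool_gaussians` is hypothetical.

```python
from math import isclose


def log_pool_gaussians(means, variances, weights):
    """Logarithmic opinion pool of 1-D Gaussian priors (illustrative assumption).

    The pooled density is proportional to prod_i N(mu_i, var_i) ** w_i.
    For Gaussians this is again Gaussian, with
        precision = sum_i w_i / var_i
        mean      = (sum_i w_i * mu_i / var_i) / precision
    """
    if not (len(means) == len(variances) == len(weights)):
        raise ValueError("means, variances, and weights must have equal length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    if any(w < 0 for w in weights) or not isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")

    # Weighted sum of precisions gives the pooled precision.
    precision = sum(w / v for w, v in zip(weights, variances))
    # Precision-weighted average of the agent means.
    mean = sum(w * m / v for w, m, v in zip(weights, means, variances)) / precision
    return mean, 1.0 / precision


# Two agents with equal weight: a confident one (variance 1) and a vaguer one (variance 4).
pooled_mean, pooled_var = log_pool_gaussians([0.0, 4.0], [1.0, 4.0], [0.5, 0.5])
```

In this example the pooled mean sits closer to the more confident agent, which illustrates why log pooling tends to favor regions where the agents' priors agree.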