CL | AI | Feb 14, 2024

Reasoning over Uncertain Text by Generative Large Language Models

arXiv:2402.09614v3 | 4 citations | h-index: 29
AI Analysis

The paper addresses a critical limitation of LLMs in applications that require probabilistic reasoning, such as medical decision-making, but its contribution is incremental because it builds on existing prompting techniques.

The paper tackles the challenge of improving Large Language Models' (LLMs) probabilistic reasoning over uncertain text by introducing the Bayesian Linguistic Inference Dataset (BLInD) and evaluating prompting strategies that map problems to formal representations like Python code and probabilistic algorithms, showing effectiveness across multiple LLMs.

This paper considers the challenges Large Language Models (LLMs) face when reasoning over text that includes information involving uncertainty explicitly quantified via probability values. This type of reasoning is relevant to a variety of contexts ranging from everyday conversations to medical decision-making. Despite improvements in the mathematical reasoning capabilities of LLMs, they still exhibit significant difficulties when it comes to probabilistic reasoning. To deal with this problem, we introduce the Bayesian Linguistic Inference Dataset (BLInD), a new dataset specifically designed to test the probabilistic reasoning capabilities of LLMs. We use BLInD to find out the limitations of LLMs for tasks involving probabilistic reasoning. In addition, we present several prompting strategies that map the problem to different formal representations, including Python code, probabilistic algorithms, and probabilistic logical programming. We conclude by providing an evaluation of our methods on BLInD and an adaptation of a causal reasoning question-answering dataset. Our empirical results highlight the effectiveness of our proposed strategies for multiple LLMs.
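
As a rough illustration of what mapping a problem to Python code can look like, the sketch below encodes a small uncertain-text problem as explicit probability tables and answers queries by enumerating the joint distribution. The problem statement, the variable names (`rain`, `wet`), and the helper functions `joint` and `query` are all hypothetical; they are not taken from BLInD or from the paper's prompts, and the paper's actual generated code may look quite different.

```python
from itertools import product

# Hypothetical toy problem (an assumption, not a BLInD instance):
# "It rains with probability 0.2. If it rains, the grass is wet with
#  probability 0.9; if it does not rain, the grass is wet with probability 0.1."
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {True: 0.9, False: 0.1}  # P(wet=True | rain)


def joint(rain: bool, wet: bool) -> float:
    """P(rain, wet) from the chain rule P(rain) * P(wet | rain)."""
    p_wet_true = p_wet_given_rain[rain]
    return p_rain[rain] * (p_wet_true if wet else 1.0 - p_wet_true)


def query(target: dict, evidence: dict) -> float:
    """P(target | evidence), computed by enumerating every possible world."""
    numerator = 0.0
    denominator = 0.0
    for rain, wet in product([True, False], repeat=2):
        world = {"rain": rain, "wet": wet}
        # Skip worlds that contradict the evidence.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(rain, wet)
        denominator += p
        if all(world[name] == value for name, value in target.items()):
            numerator += p
    return numerator / denominator


p_wet = query({"wet": True}, {})
p_rain_given_wet = query({"rain": True}, {"wet": True})
print(f"P(wet) = {p_wet:.4f}")
print(f"P(rain | wet) = {p_rain_given_wet:.4f}")
```

Writing the probabilities out as tables and delegating the arithmetic to code is the general idea behind this kind of strategy: the model only has to extract the numbers and the query from the text, while the marginalisation and conditioning are computed exactly.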

Code Implementations: 1 repo