CL AI Mar 7, 2025

An Empirical Study of Conformal Prediction in LLM with ASP Scaffolds for Robust Reasoning

Navdeep Kaur, Lachlan McPheat, Alessandra Russo, Anthony G Cohn, Pranava Madhyastha

arXiv:2503.05439v2, 3 citations, h-index: 14

Originality: Incremental advance

AI Analysis

This work addresses robust LLM reasoning on spatial reasoning tasks. It is incremental: it builds on existing CLM and ASP methods, and the authors note limitations on tasks that require longer reasoning chains.

The paper tackles the problem of improving open-weight LLMs on complex multi-step reasoning tasks by combining Conformal Language Modelling (CLM) with Answer Set Programming (ASP) scaffolds, and reports significant accuracy improvements over standard-sampling baselines on the StepGame dataset.

In this paper, we examine the use of Conformal Language Modelling (CLM) alongside Answer Set Programming (ASP) to enhance the performance of standard open-weight LLMs on complex multi-step reasoning tasks. Using the StepGame dataset, which requires spatial reasoning, we apply CLM to generate sets of ASP programs from an LLM, providing statistical guarantees on the correctness of the outputs. Experimental results show that CLM significantly outperforms baseline models that use standard sampling methods, achieving substantial accuracy improvements across different levels of reasoning complexity. Additionally, the LLM-as-Judge metric enhances CLM's performance, especially in assessing structurally and logically correct ASP outputs. However, calibrating CLM with diverse calibration sets did not improve generalizability for tasks requiring much longer reasoning steps, indicating limitations in handling more complex tasks.
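To make the idea of a calibrated set of candidate programs more concrete, the following is a minimal sketch of threshold calibration in the style of conformal risk control. It is not the paper's implementation, and the full CLM procedure is more involved. The function names, the `(score, is_correct)` candidate format, and the assumption that each sampled ASP program comes with a confidence score and a correctness label on the calibration set are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical format: one calibration example is a list of sampled candidates,
# each a (confidence score, is_correct) pair. How scores and correctness labels
# are obtained is an assumption and is not specified here.
Candidate = Tuple[float, bool]


def count_misses(calib: List[List[Candidate]], lam: float) -> int:
    """Count examples whose set {score >= lam} contains no correct candidate."""
    return sum(
        not any(ok for score, ok in cands if score >= lam)
        for cands in calib
    )


def calibrate_threshold(calib: List[List[Candidate]], epsilon: float) -> Optional[float]:
    """Return the largest observed score threshold whose finite-sample-corrected
    miss rate (misses + 1) / (n + 1) is at most epsilon, or None if none qualifies."""
    n = len(calib)
    thresholds = sorted({s for cands in calib for s, _ in cands}, reverse=True)
    # Lowering the threshold only grows the sets, so misses never increase;
    # the first threshold that passes is therefore the largest valid one.
    for lam in thresholds:
        if (count_misses(calib, lam) + 1) / (n + 1) <= epsilon:
            return lam
    return None


def prediction_set(candidates: List[Tuple[str, float]], lam: float) -> List[str]:
    """Keep the candidate programs whose score reaches the calibrated threshold."""
    return [program for program, score in candidates if score >= lam]
```

At test time, `prediction_set` would be applied to the programs sampled for a new question. Under the usual exchangeability assumption between calibration and test examples, a threshold chosen this way aims to keep the rate of sets with no correct program at or below `epsilon`. If `calibrate_threshold` returns `None`, no threshold meets the target on the given calibration data.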
