AI | LO | Feb 17, 2025

Logic.py: Bridging the Gap between LLMs and Constraint Solvers

arXiv:2502.15776v1 | 6 citations | h-index: 7 | Has Code
Originality: Highly original
AI Analysis

The paper addresses the difficulty LLMs have with complex reasoning tasks by pairing them with specialized tools. The advance is significant but domain-specific.

The paper tackles search-based logic puzzles with large language models. Rather than solving a puzzle directly, the LLM formalizes it in a domain-specific language (Logic.py), and a constraint solver then solves the formalized problem. This approach achieves a 65% absolute improvement over the Llama 3.1 70B baseline, reaching over 90% accuracy on the ZebraLogicBench benchmark.

We present a novel approach to formalise and solve search-based problems using large language models, which significantly improves upon previous state-of-the-art results. We demonstrate the efficacy of this approach on the logic puzzles benchmark ZebraLogicBench. Instead of letting the LLM attempt to directly solve the puzzles, our method prompts the model to formalise the problem in a logic-focused domain-specific language (DSL) called Logic.py. This formalised representation is then solved using a constraint solver, leveraging the strengths of both the language model and the solver. Our approach achieves a remarkable 65% absolute improvement over the baseline performance of Llama 3.1 70B on ZebraLogicBench, setting a new state-of-the-art with an accuracy of over 90%. This significant advancement demonstrates the potential of combining language models with domain-specific languages and auxiliary tools on traditionally challenging tasks for LLMs.
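The split between formalising and solving can be illustrated with a minimal sketch. It is not Logic.py syntax and not the paper's solver: the three-house puzzle, the names `NAMES`, `PETS`, `CONSTRAINTS`, and `solve` are hypothetical, and a brute-force search over permutations stands in for a real constraint solver. The constraint list plays the role of the formalised representation an LLM would be asked to produce; the search never looks at the puzzle text.

```python
from itertools import permutations

# Hypothetical toy puzzle (not taken from ZebraLogicBench):
#   1. Alice lives in the leftmost house.
#   2. The dog lives in the house directly to the right of Alice's.
#   3. Carol owns the fish.
NAMES = ("Alice", "Bob", "Carol")
PETS = ("cat", "dog", "fish")

# Formalised puzzle: each constraint is a predicate over a candidate
# assignment, where name_at[i] and pet_at[i] describe house i (0 = leftmost).
CONSTRAINTS = [
    lambda name_at, pet_at: name_at[0] == "Alice",
    lambda name_at, pet_at: name_at.index("Alice") + 1 == pet_at.index("dog"),
    lambda name_at, pet_at: pet_at[name_at.index("Carol")] == "fish",
]


def solve():
    """Return every assignment that satisfies all constraints.

    Exhaustive search is only a stand-in for a constraint solver; it is
    practical here because the toy puzzle has just 36 candidates.
    """
    solutions = []
    for name_at in permutations(NAMES):
        for pet_at in permutations(PETS):
            if all(check(name_at, pet_at) for check in CONSTRAINTS):
                solutions.append(list(zip(name_at, pet_at)))
    return solutions


if __name__ == "__main__":
    for solution in solve():
        print(solution)
```

In this arrangement the language model only has to translate the clues into constraints correctly, while the search itself is left to a tool that performs it exhaustively.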

Code Implementations: 1 repo