AI · CL · Jan 29

NL2LOGIC: AST-Guided Translation of Natural Language into First-Order Logic with Large Language Models

arXiv:2602.13237v1 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate and interpretable automated reasoning in domains like law and governance, representing an incremental improvement over existing methods.

The paper tackles the translation of natural language into first-order logic for automated reasoning. It introduces NL2LOGIC, a framework that uses an abstract syntax tree as an intermediate representation, and reports 99% syntactic accuracy and up to a 30% improvement in semantic correctness over state-of-the-art baselines.

Automated reasoning is critical in domains such as law and governance, where verifying claims against facts in documents requires both accuracy and interpretability. Recent work adopts structured reasoning pipelines that translate natural language into first-order logic and delegate inference to automated solvers. With the rise of large language models, approaches such as GCD and CODE4LOGIC leverage their reasoning and code generation capabilities to improve logic parsing. However, these methods suffer from fragile syntax control due to weak enforcement of global grammar constraints and low semantic faithfulness caused by insufficient clause-level semantic understanding. We propose NL2LOGIC, a first-order logic translation framework that introduces an abstract syntax tree as an intermediate representation. NL2LOGIC combines a recursive, large-language-model-based semantic parser with an abstract-syntax-tree-guided generator that deterministically produces solver-ready logic code. Experiments on the FOLIO, LogicNLI, and ProofWriter benchmarks show that NL2LOGIC achieves 99 percent syntactic accuracy and improves semantic correctness by up to 30 percent over state-of-the-art baselines. Furthermore, integrating NL2LOGIC into Logic-LM yields near-perfect executability and improves downstream reasoning accuracy by 31 percent compared to Logic-LM's original few-shot unconstrained translation module.
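
To make the idea of an abstract syntax tree as an intermediate representation concrete, here is a minimal Python sketch. It is not the NL2LOGIC implementation: the node classes (`Pred`, `Not`, `BinOp`, `Quant`), the `render` function, and the textual output syntax are all hypothetical, chosen only for illustration. The sketch assumes a parser has already produced the tree, and shows why rendering from a tree is deterministic and cannot emit unbalanced parentheses or stray connectives.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Atomic formula: a predicate applied to variables or constants."""
    name: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Not:
    """Negation of a sub-formula."""
    body: "Formula"


@dataclass(frozen=True)
class BinOp:
    """Binary connective; op is one of 'and', 'or', 'implies'."""
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    """Quantifier; kind is 'forall' or 'exists'."""
    kind: str
    var: str
    body: "Formula"


Formula = Union[Pred, Not, BinOp, Quant]

# Hypothetical surface syntax, not tied to any particular solver.
_SYMBOLS = {"and": "&", "or": "|", "implies": "->"}


def render(node: Formula) -> str:
    """Deterministically turn an AST into a formula string."""
    if isinstance(node, Pred):
        return f"{node.name}({', '.join(node.args)})"
    if isinstance(node, Not):
        return f"-{render(node.body)}"
    if isinstance(node, BinOp):
        # An unknown connective raises KeyError instead of emitting bad syntax.
        symbol = _SYMBOLS[node.op]
        return f"({render(node.left)} {symbol} {render(node.right)})"
    if isinstance(node, Quant):
        if node.kind not in ("forall", "exists"):
            raise ValueError(f"unknown quantifier: {node.kind}")
        return f"{node.kind} {node.var}. {render(node.body)}"
    raise TypeError(f"unknown node type: {type(node).__name__}")


# "All humans are mortal." as a hand-built tree.
ast = Quant(
    "forall",
    "x",
    BinOp("implies", Pred("Human", ("x",)), Pred("Mortal", ("x",))),
)
print(render(ast))  # forall x. (Human(x) -> Mortal(x))
```

In a design like this, the language model only has to decide which node to build at each step, while the generator owns every character of the final formula. This is one plausible reading of how an AST-guided generator can keep syntactic accuracy high regardless of the model's output.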

