MedRule-KG: A Knowledge-Graph-Steered Scaffold for Mathematical Reasoning with a Lightweight Verifier
This work addresses unreliable mathematical reasoning in LLMs for applications that require safety and accuracy, though the contribution is incremental because it builds on existing verification methods.
The paper addresses large language models violating mathematical constraints in reasoning tasks by introducing MedRule-KG, a compact typed knowledge graph coupled with a symbolic verifier. On a 90-example benchmark, exact match rises from 0.767 to 0.900 with knowledge-graph grounding alone and to 1.000 once the verifier is added.
Large language models (LLMs) often produce fluent reasoning steps while violating simple mathematical or logical constraints. We introduce MedRule-KG, a compact typed knowledge graph coupled with a symbolic verifier, designed to enforce mathematically interpretable rules in reasoning tasks. MedRule-KG encodes entities, relations, and three domain-inspired rules, while the verifier checks predictions and applies minimal corrections to guarantee consistency. On a 90-example FDA-derived benchmark, grounding in MedRule-KG improves exact match (EM) from 0.767 to 0.900, and adding the verifier yields 1.000 EM while eliminating rule violations entirely. We demonstrate how MedRule-KG provides a general scaffold for safe mathematical reasoning, discuss ablations, and release code and data to encourage reproducibility.
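To make the check-and-correct idea concrete, the following is a minimal sketch, not the released implementation. It assumes the knowledge graph is a set of typed triples, that a rule derives a label from those triples, and that a "minimal correction" overwrites only the predicted fields that contradict a rule. The entity names, the relations `inhibits` and `metabolized_by`, the single rule, and the helpers `implied_labels`, `verify`, and `exact_match` are all hypothetical; the abstract does not state the three rules.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical typed knowledge graph; entities and relations are illustrative only.
KG: Set[Triple] = {
    ("drug_a", "inhibits", "enzyme_x"),
    ("drug_b", "metabolized_by", "enzyme_x"),
    ("drug_c", "metabolized_by", "enzyme_y"),
}


def tails(kg: Set[Triple], head: str, relation: str) -> Set[str]:
    """Return every tail entity linked to `head` by `relation`."""
    return {t for (h, r, t) in kg if h == head and r == relation}


def implied_labels(kg: Set[Triple], d1: str, d2: str) -> Dict[str, bool]:
    """Hypothetical rule: d1 interacts with d2 if d1 inhibits an enzyme
    that metabolizes d2. The paper's actual rules are not given here."""
    shared = tails(kg, d1, "inhibits") & tails(kg, d2, "metabolized_by")
    return {"interaction": bool(shared)}


def verify(kg: Set[Triple], pair: Tuple[str, str],
           prediction: Dict[str, bool]) -> Tuple[Dict[str, bool], int]:
    """Check a prediction against the rule-implied labels and overwrite
    only the fields that disagree. Returns (corrected, violation_count)."""
    corrected = dict(prediction)
    violations = 0
    for field, value in implied_labels(kg, *pair).items():
        if corrected.get(field) != value:
            corrected[field] = value
            violations += 1
    return corrected, violations


def exact_match(preds: List[Dict[str, bool]], golds: List[Dict[str, bool]]) -> float:
    """Fraction of predictions that equal the gold answer exactly."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


# Toy data: the second raw prediction violates the rule.
pairs = [("drug_a", "drug_b"), ("drug_a", "drug_c")]
golds = [{"interaction": True}, {"interaction": False}]
raw = [{"interaction": True}, {"interaction": True}]

verified = [verify(KG, pair, pred)[0] for pair, pred in zip(pairs, raw)]
print(exact_match(raw, golds), exact_match(verified, golds))
```

On this toy data the verifier flips only the one field that contradicts the rule, which is the sense of "minimal correction" assumed here. If EM is the fraction of exactly correct examples, the reported scores of 0.767, 0.900, and 1.000 on 90 examples would correspond to 69, 81, and 90 exact matches; these counts are inferred from the rounded scores, not reported in the abstract.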