Logic Sketch Prompting (LSP): A Deterministic and Interpretable Prompting Method
LSP addresses the need for deterministic and interpretable AI in clinical and safety-critical systems, though it is an incremental improvement over existing prompting techniques.
The paper tackles the unreliability of LLMs on tasks requiring strict rule adherence and determinism by introducing Logic Sketch Prompting (LSP), which achieves the highest accuracy (0.83 to 0.89) and F1 score (0.83 to 0.89) across all models and tasks, outperforming zero-shot, concise, and chain-of-thought prompting.
Large language models (LLMs) excel at natural language reasoning but remain unreliable on tasks requiring strict rule adherence, determinism, and auditability. Logic Sketch Prompting (LSP) is a lightweight prompting framework that introduces typed variables, deterministic condition evaluators, and a rule-based validator that produces traceable and repeatable outputs. Using two pharmacologic logic compliance tasks, we benchmark LSP against zero-shot prompting, chain-of-thought prompting, and concise prompting across three open-weight models: Gemma 2, Mistral, and Llama 3. Across both tasks and all models, LSP consistently achieves the highest accuracy (0.83 to 0.89) and F1 score (0.83 to 0.89), substantially outperforming zero-shot prompting (0.24 to 0.60), concise prompts (0.16 to 0.30), and chain-of-thought prompting (0.56 to 0.75). McNemar tests show statistically significant gains for LSP across nearly all comparisons (p < 0.01). These results demonstrate that LSP improves determinism, interpretability, and consistency without sacrificing performance, supporting its use in clinical, regulated, and safety-critical decision support systems.
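The abstract names three components (typed variables, deterministic condition evaluators, and a rule-based validator) without giving implementation details. The following is a minimal sketch of how such components might fit together, not the authors' implementation. Every name in it (`Patient`, `CONDITIONS`, `RULES`, `evaluate`, `validate`), the three output labels, and all thresholds are hypothetical and chosen only for illustration; they are not clinical guidance and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Typed variables: a hypothetical, immutable record of the inputs a rule may read.
@dataclass(frozen=True)
class Patient:
    age_years: int
    egfr_ml_min: float
    on_anticoagulant: bool


# A condition evaluator is a pure function from typed inputs to a boolean.
Condition = Callable[[Patient], bool]

# Illustrative conditions; the thresholds are placeholders, not clinical guidance.
CONDITIONS: Dict[str, Condition] = {
    "is_elderly": lambda p: p.age_years >= 65,
    "renal_impairment": lambda p: p.egfr_ml_min < 30.0,
    "on_anticoagulant": lambda p: p.on_anticoagulant,
}

# Each rule: (rule id, conditions that must all hold, resulting label).
# Rules are checked in list order and the first match wins.
RULES: List[Tuple[str, Tuple[str, ...], str]] = [
    ("R1", ("renal_impairment", "on_anticoagulant"), "NON_COMPLIANT"),
    ("R2", ("is_elderly", "renal_impairment"), "REVIEW"),
]
DEFAULT_LABEL = "COMPLIANT"
ALLOWED_LABELS = {"COMPLIANT", "REVIEW", "NON_COMPLIANT"}


def evaluate(patient: Patient) -> Tuple[str, List[str]]:
    """Return the rule-derived label plus a human-readable trace."""
    # Conditions are evaluated in a fixed order, so the trace is repeatable.
    facts = {name: cond(patient) for name, cond in CONDITIONS.items()}
    trace = [f"{name}={value}" for name, value in facts.items()]
    for rule_id, required, label in RULES:
        if all(facts[name] for name in required):
            trace.append(f"{rule_id} fired -> {label}")
            return label, trace
    trace.append(f"no rule fired -> {DEFAULT_LABEL}")
    return DEFAULT_LABEL, trace


def validate(model_output: str, patient: Patient) -> bool:
    """Check a model's answer against the rule-derived label."""
    expected, _ = evaluate(patient)
    answer = model_output.strip().upper()
    return answer in ALLOWED_LABELS and answer == expected
```

Under these assumptions, the same `Patient` always yields the same label and the same trace, which is the kind of repeatability and auditability the abstract attributes to LSP. A model response that falls outside the allowed label set, or that disagrees with the rule-derived label, is rejected by `validate`.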