CL AI · Feb 28, 2025

Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems

Jędrzej Warczyński, Mateusz Lango, Ondrej Dusek

arXiv:2502.20609v1 · 23 citations · h-index: 30 · INLG

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient, interpretable data-to-text generation in applications that require transparency and speed. It is an incremental advance, since it builds on existing LLM and rule-based methods.

The authors tackle the problem of generating high-quality text from structured data by using a large language model to automatically write an interpretable rule-based system in Python. On WebNLG, the resulting system achieves better BLEU and BLEURT scores than prompting the same LLM directly and produces fewer hallucinations than a fine-tuned BART model, while running faster on a single CPU.

We introduce a simple approach that uses a large language model (LLM) to automatically implement a fully interpretable rule-based data-to-text system in pure Python. Experimental evaluation on the WebNLG dataset showed that such a constructed system produces text of better quality (according to the BLEU and BLEURT metrics) than the same LLM prompted to directly produce outputs, and produces fewer hallucinations than a BART language model fine-tuned on the same data. Furthermore, at runtime, the approach generates text in a fraction of the processing time required by neural approaches, using only a single CPU.
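To give a rough sense of what a rule-based data-to-text system in pure Python can look like, here is a minimal sketch. It is not the code produced in the paper: the rule table `RULES`, the helpers `clean` and `verbalize`, and the three predicate templates are hypothetical, written by hand for illustration. It assumes WebNLG-style input in the form of (subject, predicate, object) triples.

```python
from typing import Callable, Dict, List, Tuple

# A WebNLG-style fact: (subject, predicate, object).
Triple = Tuple[str, str, str]


def clean(entity: str) -> str:
    """Turn an identifier such as 'Alan_Bean' into readable text."""
    return entity.replace("_", " ").strip('"')


# Hypothetical rules: one template per predicate. In the approach described
# above, rules like these would be written by an LLM rather than by hand.
RULES: Dict[str, Callable[[str, str], str]] = {
    "birthPlace": lambda s, o: f"{s} was born in {o}.",
    "occupation": lambda s, o: f"{s} worked as a {o}.",
    "country": lambda s, o: f"{s} is located in {o}.",
}


def verbalize(triples: List[Triple]) -> str:
    """Render each triple with its rule, or a generic fallback template."""
    sentences = []
    for subj, pred, obj in triples:
        s, o = clean(subj), clean(obj)
        rule = RULES.get(pred)
        if rule is not None:
            sentences.append(rule(s, o))
        else:
            # Assumed fallback for predicates that have no dedicated rule.
            sentences.append(f"The {pred} of {s} is {o}.")
    return " ".join(sentences)


if __name__ == "__main__":
    facts = [
        ("Alan_Bean", "birthPlace", "Wheeler,_Texas"),
        ("Alan_Bean", "occupation", "Test_pilot"),
    ]
    print(verbalize(facts))
```

Because every output sentence can be traced to a single template, a system of this kind can only state what is in the input triples, and it needs nothing beyond string formatting at runtime.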
