CL Dec 12, 2024

Learning to Solve Domain-Specific Calculation Problems with Knowledge-Intensive Programs Generator

Chengyuan Liu, Shihang Wang, Lizhi Qing, Jun Lin, Ji Zhang, Fei Wu, Kun Kuang

arXiv:2412.09280v18.211 citations, h-index: 26, NAACL

Originality: Incremental advance

AI Analysis

This work addresses the problem of automating knowledge-intensive calculations in specialized fields like law, offering an incremental improvement by enhancing logic consistency through iterative alignment.

The paper tackles domain-specific calculation problems whose complex rules and knowledge requirements make them difficult for large language models. It proposes a pipeline called KIPG that generates knowledge-intensive programs from domain documents and iteratively aligns them for logic consistency. In experiments within the legal domain, the method proved effective and was found adaptable to other domains without retraining.

Domain Large Language Models (LLMs) are developed for domain-specific tasks on top of general LLMs, but some domain-specific tasks still require professional knowledge to supply the necessary expertise. In this paper, we investigate knowledge-intensive calculation problems. We find that math problems are challenging for LLMs when they involve complex domain-specific rules and knowledge documents, rather than simple formulations of terminology. We therefore propose a pipeline, named KIPG (Knowledge-Intensive Programs Generator), to solve domain-specific calculation problems more effectively. It generates knowledge-intensive programs according to the domain-specific documents. For each query, key variables are extracted, and the outcomes that depend on domain knowledge are then calculated with the programs. Through iterative preference alignment, the code generator learns to improve its logical consistency with the domain knowledge. Taking the legal domain as an example, we conduct experiments to demonstrate the effectiveness of our pipeline, along with extensive analysis of its modules. We also find that the code generator is adaptable to other domains without training on the new knowledge.
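The two-stage flow described in the abstract (extract key variables from a query, then run a program that encodes the domain rule) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Variables` schema, the regex-based `extract_variables` (standing in for what would presumably be a model-based extraction step), and the late-fee rule in `late_fee_program` are all hypothetical. None of it is taken from the paper or from any real statute.

```python
import re
from dataclasses import dataclass


@dataclass
class Variables:
    """Key variables pulled from a query (hypothetical schema)."""
    principal: float  # amount owed
    days_late: int    # days past the due date


def extract_variables(query: str) -> Variables:
    """Toy regex extractor standing in for the variable-extraction step."""
    principal = re.search(r"owes\s+\$?([\d,]+(?:\.\d+)?)", query)
    days = re.search(r"(\d+)\s+days?\s+late", query)
    if principal is None or days is None:
        raise ValueError("could not find required variables in query")
    return Variables(
        principal=float(principal.group(1).replace(",", "")),
        days_late=int(days.group(1)),
    )


def late_fee_program(v: Variables) -> float:
    """Stand-in for a generated program encoding one made-up rule:
    0.05% of the principal per day late, capped at 20% of the principal."""
    daily_rate = 0.0005  # assumed rate, not from any real document
    cap_ratio = 0.20     # assumed cap, not from any real document
    fee = v.principal * daily_rate * v.days_late
    return round(min(fee, v.principal * cap_ratio), 2)


def solve(query: str) -> float:
    """Extract variables from the query, then apply the rule program."""
    return late_fee_program(extract_variables(query))


if __name__ == "__main__":
    print(solve("The tenant owes $10,000 and is 30 days late."))
```

Keeping the rule in an ordinary function separates the arithmetic from the language understanding: the calculation is deterministic and can be checked against the source document, while only the extraction step depends on reading the query.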
