Enhancing LLM Tool Use with High-quality Instruction Data from Knowledge Graph
This work addresses the insufficient quality of instruction data for LLM tool use. Tool use is crucial for enhancing LLMs' problem-solving abilities and expanding their applications, and the work represents an incremental advance over previous methods.
The paper tackles the challenge of teaching large language models (LLMs) to use tools effectively. It proposes a method that generates high-quality instruction data from knowledge graphs, and fine-tuning on a small sample of this synthetic data significantly improves tool utilization and overall capabilities.
Teaching large language models (LLMs) to use tools is crucial for improving their problem-solving abilities and expanding their applications. However, effectively using tools is challenging because it requires a deep understanding of tool functionalities and user intentions. Previous methods relied mainly on LLMs to generate instruction data, but the quality of these data was often insufficient. In this paper, we propose a new method that uses knowledge graphs to generate high-quality instruction data for LLMs. Knowledge graphs are manually curated datasets rich in semantic information. We begin by extracting various query pathways from a given knowledge graph, which are transformed into a broad spectrum of user queries. We then translate the relationships between entities into actionable tools and parse the pathways of each query into detailed solution steps, thereby creating high-quality instruction data. Our experiments show that fine-tuning on just a small sample of this synthetic data can significantly improve the tool utilization and overall capabilities of LLMs.
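The three steps described above (extracting query pathways, turning relations into tools, and parsing each pathway into solution steps) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy triples, the function names, the `get_<relation>` tool naming scheme, and the template-based query wording are hypothetical and are not taken from the paper, which does not specify these details here.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Toy knowledge graph; purely illustrative data, not from the paper.
TRIPLES: List[Triple] = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "born_in", "London"),
    ("London", "capital_of", "United Kingdom"),
]


def build_index(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each head entity to its outgoing (relation, tail) edges."""
    index: Dict[str, List[Tuple[str, str]]] = {}
    for head, relation, tail in triples:
        index.setdefault(head, []).append((relation, tail))
    return index


def extract_paths(index, start: str, max_hops: int) -> List[List[Triple]]:
    """Enumerate every path of 1..max_hops edges that begins at `start`."""
    paths: List[List[Triple]] = []

    def walk(node: str, path: List[Triple]) -> None:
        if path:
            paths.append(list(path))
        if len(path) == max_hops:
            return
        for relation, tail in index.get(node, []):
            path.append((node, relation, tail))
            walk(tail, path)
            path.pop()

    walk(start, [])
    return paths


def relation_to_tool(relation: str) -> str:
    """Hypothetical naming scheme: one callable tool per relation type."""
    return f"get_{relation}"


def path_to_sample(path: List[Triple]) -> dict:
    """Turn one pathway into a query plus step-by-step tool calls."""
    start = path[0][0]
    relations = [relation for _, relation, _ in path]
    # Template wording is a placeholder for whatever query phrasing is used.
    query = f"Starting from '{start}', follow: " + " -> ".join(relations)
    steps = [
        {"tool": relation_to_tool(relation), "input": head, "output": tail}
        for head, relation, tail in path
    ]
    return {"query": query, "steps": steps, "answer": path[-1][2]}


if __name__ == "__main__":
    for path in extract_paths(build_index(TRIPLES), "Inception", max_hops=3):
        print(path_to_sample(path))
```

In this sketch each multi-hop pathway yields one training sample whose solution steps chain tool calls, with the output of one call feeding the input of the next. Longer pathways therefore produce harder multi-step queries, which is one plausible way to obtain the "broad spectrum of user queries" the abstract mentions.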