CL · Dec 5, 2024

Reducing Tool Hallucination via Reliability Alignment

arXiv:2412.04141v3 · 29 citations · h-index: 16 · ICML
Originality: Incremental advance
AI Analysis

This work addresses a critical issue for developers and users of LLM-based automation systems by improving system reliability, though the advance appears incremental because it builds on existing tool-use capabilities.

The paper tackles tool hallucination in Large Language Models, where a model selects an inappropriate tool or misuses one, and proposes a reliability alignment framework called Relign that is reported to significantly reduce these hallucinations while improving task reliability and efficiency.

Large Language Models (LLMs) have expanded their capabilities beyond language generation to interact with external tools, enabling automation and real-world applications. However, tool hallucinations, where models either select inappropriate tools or misuse them, pose significant challenges, leading to erroneous task execution, increased computational costs, and reduced system reliability. To systematically address this issue, we define and categorize tool hallucinations into two main types, tool selection hallucination and tool usage hallucination. To evaluate and mitigate these issues, we introduce RelyToolBench, which integrates specialized test cases and novel metrics to assess hallucination-aware task success and efficiency. Finally, we propose Relign, a reliability alignment framework that expands the tool-use action space to include indecisive actions, allowing LLMs to defer tool use, seek clarification, or adjust tool selection dynamically. Through extensive experiments, we demonstrate that Relign significantly reduces tool hallucinations, improves task reliability, and enhances the efficiency of LLM tool interactions.
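
To make the two hallucination types and the expanded action space concrete, here is a minimal rule-based sketch. It is not the paper's method: Relign is described as an alignment framework, whereas this code only illustrates what an action space with indecisive actions might look like. The names `Action`, `ToolSpec`, `check_call`, and `choose_action` are hypothetical, and the detection rules (unknown tool name means selection hallucination, wrong argument set means usage hallucination) are simplifying assumptions, not the definitions used in RelyToolBench.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    """Hypothetical action space: one decisive action plus three indecisive ones."""
    CALL_TOOL = "call_tool"                  # decisive: execute the proposed call
    DEFER = "defer"                          # indecisive: do not use a tool now
    ASK_CLARIFICATION = "ask_clarification"  # indecisive: ask the user for details
    CHANGE_TOOL = "change_tool"              # indecisive: pick a different tool


@dataclass
class ToolSpec:
    """Assumed tool description: a name plus required and optional argument names."""
    name: str
    required: set[str]
    optional: set[str] = field(default_factory=set)


def check_call(tool_name, args, registry):
    """Label a proposed call under the simplified rules assumed in this sketch."""
    spec = registry.get(tool_name)
    if spec is None:
        # Assumption: a tool missing from the registry counts as a selection error.
        return "tool_selection_hallucination"
    given = set(args)
    missing_required = not spec.required <= given
    unknown_args = not given <= spec.required | spec.optional
    if missing_required or unknown_args:
        # Assumption: a wrong argument set counts as a usage error.
        return "tool_usage_hallucination"
    return None


def choose_action(tool_name, args, registry):
    """Map the label to an action; a trained model would learn this choice instead."""
    label = check_call(tool_name, args, registry)
    if label is None:
        return Action.CALL_TOOL
    if label == "tool_selection_hallucination":
        # With other tools available, reselect; with none, defer tool use.
        return Action.CHANGE_TOOL if registry else Action.DEFER
    return Action.ASK_CLARIFICATION


registry = {"get_weather": ToolSpec("get_weather", {"city"}, {"unit"})}
print(choose_action("get_weather", {"city": "Paris"}, registry))
print(choose_action("get_stock", {"ticker": "ACME"}, registry))
print(choose_action("get_weather", {}, registry))
```

The point of the sketch is the shape of the decision: instead of always emitting a tool call, the policy can fall back to an indecisive action when a call looks unreliable.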

