LG · AI · CL · Feb 3, 2025

Tool Unlearning for Tool-Augmented LLMs

arXiv:2502.01083v2 · 12 citations · h-index: 9 · ICML
Originality: Incremental advance
AI Analysis

This work addresses a security and maintenance problem for users of tool-augmented LLMs, but it is an incremental advance because it adapts unlearning techniques to a new context.

The paper addresses how tool-augmented large language models can forget specific learned tools for security, privacy, or deprecation reasons. It proposes ToolDelete as the first approach for this task and reports experiments in which it unlearns tools while preserving knowledge of other tools and general performance.

Tool-augmented large language models (LLMs) are often trained on datasets of query-response pairs, which embed the ability to use tools or APIs directly into the parametric knowledge of LLMs. Tool-augmented LLMs need the ability to forget learned tools due to security vulnerabilities, privacy regulations, or tool deprecations. However, "tool unlearning" has not been investigated in unlearning literature. We introduce this novel task, which requires addressing distinct challenges compared to traditional unlearning: knowledge removal rather than forgetting individual samples, the high cost of optimizing LLMs, and the need for principled evaluation metrics. To bridge these gaps, we propose ToolDelete, the first approach for unlearning tools from tool-augmented LLMs. It implements three key properties to address the above challenges for effective tool unlearning and introduces a new membership inference attack (MIA) model for effective evaluation. Extensive experiments on multiple tool learning datasets and tool-augmented LLMs show that ToolDelete effectively unlearns randomly selected tools, while preserving the LLM's knowledge on non-deleted tools and maintaining performance on general tasks.
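The abstract names a membership inference attack (MIA) as the evaluation tool but does not describe it here. As a rough illustration of the general idea only, the sketch below uses a generic loss-threshold MIA baseline, which is not the paper's MIA model. It assumes per-sample losses are already available for three groups: tools the model should still know, tools it never saw, and tools that were supposed to be deleted. All function names and loss values are hypothetical.

```python
from statistics import mean


def loss_threshold_mia(member_losses, nonmember_losses, candidate_losses):
    """Generic loss-threshold MIA (an assumption, not the paper's attack).

    A candidate is flagged as a likely training member when its loss falls
    below the midpoint between the mean member and mean non-member loss.
    """
    threshold = (mean(member_losses) + mean(nonmember_losses)) / 2.0
    return [loss < threshold for loss in candidate_losses]


def forget_rate(flags):
    """Fraction of candidates the attack judges to be non-members."""
    return 1.0 - sum(flags) / len(flags)


# Hypothetical per-sample losses, made up for illustration.
retained_tool_losses = [0.2, 0.3, 0.25]      # tools the model should still know
unseen_tool_losses = [1.8, 2.0, 2.2]         # tools never seen in training
deleted_tool_losses = [1.9, 0.4, 2.1, 1.7]   # tools targeted by unlearning

flags = loss_threshold_mia(
    retained_tool_losses, unseen_tool_losses, deleted_tool_losses
)
rate = forget_rate(flags)
print(flags, rate)
```

Under this baseline, a higher forget rate on deleted tools suggests the attack can no longer tell them apart from unseen tools. A low rate suggests some tool knowledge remains detectable.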

