CR · Jun 2

SEEM: Exploiting Black-Box Text Attacks to Manipulate Tool Selection

arXiv:2504.04809 · 79.82 citations
Predicted impact: top 12% in CR · last 90 days · Originality: Incremental advance
AI Analysis

The paper exposes a previously overlooked vulnerability in LLM tool selection, highlighting security risks for developers and users who rely on tool-augmented LLMs.

The paper introduces SEEM, a black-box text attack that manipulates tool selection in LLMs by perturbing tool descriptions, significantly increasing the target tool's selection probability. Experiments show the attack effectively raises the target tool's ranking among candidates.

Tool learning has emerged as a powerful auxiliary mechanism that extends the capabilities of large language models (LLMs), enabling them to address complex tasks that demand real-time relevance or high-precision operations. However, beneath this strength lie significant security risks. Prior studies have primarily concentrated on corrupting the outputs of invoked tools, while largely overlooking the vulnerability of the tool selection process itself. To bridge this gap, we introduce a black-box, text-based attack that substantially increases the likelihood of a target tool being selected. We propose SEEM, a two-level coarse-to-fine perturbation method that operates at both the word and character levels. Through comprehensive experiments, we show that merely perturbing the textual information of tools can markedly raise the probability of the target tool being prioritized and ranked higher among candidates. Our findings expose critical weaknesses in the tool selection mechanism and lay the groundwork for developing defenses to secure this essential process.
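The abstract gives no algorithmic detail, so the following is only a toy sketch of what a coarse-to-fine, black-box perturbation of a tool description could look like. It is not the authors' SEEM implementation. Everything in it is an assumption made for illustration: the token-overlap scorer standing in for the black-box selector, the single probe query, the attacker vocabulary `vocab`, the edit budgets, and all function names. The attacker only calls the scorer and never inspects its internals.

```python
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[str, str], float]  # (query, description) -> relevance score


def overlap_score(query: str, description: str) -> float:
    """Toy stand-in for a black-box selector: share of query tokens in the description."""
    q = set(query.lower().split())
    d = set(description.lower().split())
    return len(q & d) / len(q) if q else 0.0


def rank_of(target: str, query: str, tools: Dict[str, str], score: Scorer) -> int:
    """1-based rank of `target` when tools are sorted by descending score."""
    ordered = sorted(tools, key=lambda name: score(query, tools[name]), reverse=True)
    return ordered.index(target) + 1


def char_variants(word: str) -> List[str]:
    """Single-character edits: delete one character or swap two neighbours."""
    deletions = [word[:i] + word[i + 1:] for i in range(len(word))]
    swaps = [word[:i] + word[i + 1] + word[i] + word[i + 2:] for i in range(len(word) - 1)]
    return deletions + swaps


def greedy_edit(words: Sequence[str], query: str, score: Scorer,
                candidates_for: Callable[[str], Sequence[str]], budget: int) -> List[str]:
    """Apply up to `budget` single-word replacements, each the best strict improvement."""
    words = list(words)
    for _ in range(budget):
        base = score(query, " ".join(words))
        best_gain, best_edit = 0.0, None
        for i, word in enumerate(words):
            for cand in candidates_for(word):
                trial = words[:i] + [cand] + words[i + 1:]
                gain = score(query, " ".join(trial)) - base
                if gain > best_gain:
                    best_gain, best_edit = gain, (i, cand)
        if best_edit is None:  # no edit helps any more
            break
        i, cand = best_edit
        words[i] = cand
    return words


def perturb(description: str, query: str, score: Scorer, vocab: Sequence[str],
            word_budget: int = 2, char_budget: int = 2) -> str:
    """Coarse word-level pass followed by a fine character-level pass (illustrative only)."""
    words = description.split()
    # Coarse pass: replace whole words with entries from the attacker's vocabulary.
    words = greedy_edit(words, query, score, lambda w: vocab, word_budget)
    # Fine pass: single-character edits on the words left after the coarse pass.
    words = greedy_edit(words, query, score, char_variants, char_budget)
    return " ".join(words)


# Hypothetical tool pool and probe query, invented for this example.
tools = {
    "weather_pro": "reports current conditions and temperature for a city",
    "calculator": "evaluates arithmetic expressions",
    "sky_info": "provides atmospheric data and forecasts by location",
}
query = "weather forecast for paris"
vocab = ["weather", "city"]  # assumed attacker guess at common query words

before = rank_of("sky_info", query, tools, overlap_score)
adv = perturb(tools["sky_info"], query, overlap_score, vocab)
attacked = {**tools, "sky_info": adv}
after = rank_of("sky_info", query, attacked, overlap_score)
print(before, after, repr(adv))
```

Because each pass accepts only edits that strictly raise the target's score and leaves the other descriptions untouched, the target's rank under this toy scorer can never get worse. A real selector would be an LLM or retriever queried through its API, and the attack would need a budget on how far the perturbed description may drift from the original.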
