CR · Jun 2

SEEM: Exploiting Black-Box Text Attacks to Manipulate Tool Selection

Liuji Chen, Hao Gao, Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang

arXiv:2504.04809 · 79.82 citations

Predicted impact top 12% in CR · last 90 days · Originality: Incremental advance

AI Analysis

This paper exposes a previously overlooked vulnerability in LLM tool selection, highlighting security risks for developers and users who rely on tool-augmented LLMs.

The paper introduces SEEM, a black-box text attack that manipulates tool selection in LLMs by perturbing tool descriptions, significantly increasing the target tool's selection probability. Experiments show the attack effectively raises the target tool's ranking among candidates.

Tool learning has emerged as a powerful auxiliary mechanism that extends the capabilities of large language models (LLMs), enabling them to address complex tasks that demand real-time relevance or high-precision operations. However, beneath this strength lie significant security risks. Prior studies have primarily concentrated on corrupting the outputs of invoked tools, while largely overlooking the vulnerability of the tool selection process itself. To bridge this gap, we introduce a black-box, text-based attack that substantially increases the likelihood of a target tool being selected. We propose SEEM, a two-level coarse-to-fine perturbation method that operates at both the word and character levels. Through comprehensive experiments, we show that merely perturbing the textual information of tools can markedly raise the probability of the target tool being prioritized and ranked higher among candidates. Our findings expose critical weaknesses in the tool selection mechanism and lay the groundwork for developing defenses to secure this essential process.
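The abstract names a two-level, coarse-to-fine perturbation but gives no algorithmic detail, so the following is only a toy sketch of the general idea and not the authors' SEEM procedure. It assumes a stand-in black-box scorer (character-trigram cosine similarity), a single known query, and a small substitution vocabulary; the names `score`, `rank_of`, `word_level`, `char_level`, and `perturb` are all hypothetical. The attacker only calls the scorer and keeps an edit when the target tool's score goes up.

```python
import math
from collections import Counter


def trigrams(text: str) -> Counter:
    """Character trigram counts of the lowercased, space-padded text."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def score(query: str, description: str) -> float:
    """Stand-in black-box scorer (assumption): trigram cosine similarity."""
    q, d = trigrams(query), trigrams(description)
    dot = sum(q[g] * d[g] for g in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def rank_of(target: str, query: str, tools: dict[str, str]) -> int:
    """1-based rank of `target` when tools are sorted by descending score."""
    ordered = sorted(tools, key=lambda name: score(query, tools[name]), reverse=True)
    return ordered.index(target) + 1


def word_level(desc: str, query: str, vocab: list[str]) -> str:
    """Coarse stage: greedily substitute whole words, keeping only improvements."""
    words = desc.split()
    best = score(query, desc)
    for i in range(len(words)):
        for cand in vocab:
            trial = words[:i] + [cand] + words[i + 1:]
            s = score(query, " ".join(trial))
            if s > best:
                words, best = trial, s
    return " ".join(words)


def char_level(desc: str, query: str) -> str:
    """Fine stage: try deleting one character per word, keeping only improvements."""
    words = desc.split()
    best = score(query, desc)
    for i in range(len(words)):
        word = words[i]
        for j in range(len(word)):
            edited = word[:j] + word[j + 1:]
            if not edited:
                continue
            trial = words[:i] + [edited] + words[i + 1:]
            s = score(query, " ".join(trial))
            if s > best:
                words, best = trial, s
                break  # at most one character edit per word
    return " ".join(words)


def perturb(desc: str, query: str, vocab: list[str]) -> str:
    """Coarse-to-fine: word-level substitutions first, then character-level edits."""
    return char_level(word_level(desc, query, vocab), query)


if __name__ == "__main__":
    # Hypothetical tool catalogue; only the target's description is modified.
    tools = {
        "weather": "Returns the weather forecast for a city",
        "calc": "Performs arithmetic on numbers",
    }
    query = "weather forecast for tomorrow"
    attacked = dict(tools, calc=perturb(tools["calc"], query, ["weather", "forecast"]))
    print(rank_of("calc", query, tools), rank_of("calc", query, attacked))
    print(attacked["calc"])
```

Because both stages accept an edit only when the target's score strictly increases, the target's rank can stay the same or improve but never drop under this scorer. A real tool-selection pipeline would put a retriever or an LLM behind `score`, and the size of the effect there is an empirical question that this sketch does not answer.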
