CR LGFeb 20, 2023

Prompt Stealing Attacks Against Text-to-Image Generation Models

Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang

arXiv:2302.09923v226.171 citationsh-index: 72Has Code

Originality Highly original

AI Analysis

This addresses a threat to the intellectual property of prompt engineers and the business model of prompt marketplaces, representing a novel attack vector in the text-to-image ecosystem.

The paper introduces a prompt stealing attack that extracts text prompts from images generated by text-to-image models, demonstrating that their method, PromptStealer, outperforms three baseline methods both quantitatively and qualitatively.

Text-to-Image generation models have revolutionized the artwork design process and enabled anyone to create high-quality images by entering text descriptions called prompts. Creating a high-quality prompt that consists of a subject and several modifiers can be time-consuming and costly. In consequence, a trend of trading high-quality prompts on specialized marketplaces has emerged. In this paper, we perform the first study on understanding the threat of a novel attack, namely prompt stealing attack, which aims to steal prompts from generated images by text-to-image generation models. Successful prompt stealing attacks directly violate the intellectual property of prompt engineers and jeopardize the business model of prompt marketplaces. We first perform a systematic analysis on a dataset collected by ourselves and show that a successful prompt stealing attack should consider a prompt's subject as well as its modifiers. Based on this observation, we propose a simple yet effective prompt stealing attack, PromptStealer. It consists of two modules: a subject generator trained to infer the subject and a modifier detector for identifying the modifiers within the generated image. Experimental results demonstrate that PromptStealer is superior over three baseline methods, both quantitatively and qualitatively. We also make some initial attempts to defend PromptStealer. In general, our study uncovers a new attack vector within the ecosystem established by the popular text-to-image generation models. We hope our results can contribute to understanding and mitigating this emerging threat.

View on arXiv PDF Code

Similar