GuardDoor: Safeguarding Against Malicious Diffusion Editing via Protective Backdoors
The work addresses a security concern shared by image owners and model providers: malicious edits, such as those used to spread misinformation. Its contribution is incremental, however, because it builds on existing backdoor and adversarial methods.
The paper tackles unauthorized image editing by diffusion models. It proposes GuardDoor, a cooperative protection mechanism in which imperceptible triggers are attached to images so that the participating diffusion model produces meaningless outputs when those images are edited. The protection is reported to be more robust to image preprocessing than defenses based on adversarial perturbations.
The growing accessibility of diffusion models has revolutionized image editing but also raised significant concerns about unauthorized modifications, such as misinformation and plagiarism. Existing countermeasures largely rely on adversarial perturbations designed to disrupt diffusion model outputs. However, these approaches are found to be easily neutralized by simple image preprocessing techniques, such as compression and noise addition. To address this limitation, we propose GuardDoor, a novel and robust protection mechanism that fosters collaboration between image owners and model providers. Specifically, the model provider participating in the mechanism fine-tunes the image encoder to embed a protective backdoor, allowing image owners to request the attachment of imperceptible triggers to their images. When unauthorized users attempt to edit these protected images with this diffusion model, the model produces meaningless outputs, reducing the risk of malicious image editing. Our method demonstrates enhanced robustness against image preprocessing operations and is scalable for large-scale deployment. This work underscores the potential of cooperative frameworks between model providers and image owners to safeguard digital content in the era of generative AI.
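To make the mechanism concrete, the following is a minimal sketch of one plausible fine-tuning objective for the image encoder. The abstract does not state the loss, so the two-term form here is an assumption: a backdoor term that pulls the latent of a triggered image toward a fixed "meaningless" target, and a utility term that keeps the fine-tuned encoder close to the original on clean images. All names (`add_trigger`, `protective_backdoor_loss`, `budget`, `utility_weight`) are hypothetical, and images and latents are flat lists of floats purely for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared element-wise differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def add_trigger(image: Sequence[float], trigger: Sequence[float], budget: float) -> Vector:
    """Attach a trigger to an image with pixel values in [0, 1].

    Each trigger entry is clipped to [-budget, budget] (an assumed stand-in for
    "imperceptible"), and the result is clipped back to the valid pixel range.
    """
    if len(image) != len(trigger):
        raise ValueError("image and trigger must have the same length")
    protected = []
    for pixel, delta in zip(image, trigger):
        delta = max(-budget, min(budget, delta))
        protected.append(max(0.0, min(1.0, pixel + delta)))
    return protected


def protective_backdoor_loss(
    tuned_encoder: Encoder,
    frozen_encoder: Encoder,
    image: Sequence[float],
    trigger: Sequence[float],
    target_latent: Sequence[float],
    budget: float = 0.05,
    utility_weight: float = 1.0,
) -> float:
    """Assumed objective, not the paper's stated loss.

    backdoor term: triggered image should encode to `target_latent`.
    utility term:  clean image should encode as the original encoder does.
    """
    protected = add_trigger(image, trigger, budget)
    backdoor_term = squared_distance(tuned_encoder(protected), target_latent)
    utility_term = squared_distance(tuned_encoder(image), frozen_encoder(image))
    return backdoor_term + utility_weight * utility_term
```

In this sketch a model provider would minimize the loss over the fine-tuned encoder's parameters (and possibly the trigger), while an image owner would only ever call something like `add_trigger`. Checking robustness to preprocessing would amount to applying compression or noise to the protected image before encoding and confirming that the backdoor term stays small.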