CV | AI | Jan 26, 2025

CE-SDWV: Effective and Efficient Concept Erasure for Text-to-Image Diffusion Models via a Semantic-Driven Word Vocabulary

arXiv:2501.15562v2 | 7 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses privacy and safety concerns in generative AI by enabling concept removal, though it is an incremental improvement over existing erasure methods.

The paper tackles the problem of removing undesirable concepts like NSFW content from text-to-image diffusion models by proposing the CE-SDWV framework, which adjusts text condition tokens without retraining the model, achieving effective and efficient erasure as demonstrated on benchmarks like I2P and UnlearnCanvas.

Large-scale text-to-image (T2I) diffusion models have achieved remarkable generative performance across a wide range of concepts. Given privacy and safety constraints in practice, the ability to generate NSFW (Not Safe For Work) concepts is undesirable, e.g., producing sexually explicit photos and licensed images. The concept erasure task for T2I diffusion models has therefore attracted considerable attention and calls for a method that is both effective and efficient. To this end, we propose the CE-SDWV framework, which removes target concepts (e.g., NSFW concepts) from T2I diffusion models in the text semantic space by adjusting only the text condition tokens, without re-training the original T2I diffusion model's weights. Specifically, our framework first builds a target concept-related word vocabulary to enhance the representation of the target concepts within the text semantic space, and then uses an adaptive semantic component suppression strategy to ablate the target concept-related semantic information in the text condition tokens. To further adapt these text condition tokens to the original image semantic space, we propose an end-to-end gradient-orthogonal token optimization strategy. Extensive experiments on the I2P and UnlearnCanvas benchmarks demonstrate the effectiveness and efficiency of our method. Code is available at https://github.com/TtuHamg/CE-SDWV.
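
The abstract describes suppressing concept-related semantic components in the text condition tokens. As a rough illustration only, the sketch below shows one generic way such a suppression could work: build an orthonormal basis from the embeddings of a concept vocabulary, then subtract each token's projection onto that subspace. This is an assumption for illustration, not the authors' implementation; the function names (`orthonormal_basis`, `suppress`), the fixed `strength` parameter (the paper's strategy is adaptive), and the toy 3-dimensional vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, eps=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`.

    `vectors` stands in for the embeddings of a concept-related word
    vocabulary (hypothetical toy data here, not real text-encoder output).
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def suppress(token, basis, strength=1.0):
    """Remove (a fraction of) the token's component in the concept subspace.

    strength=1.0 projects the token fully onto the orthogonal complement;
    smaller values remove only part of the concept component.
    """
    out = list(token)
    for b in basis:
        coeff = dot(token, b)
        out = [oi - strength * coeff * bi for oi, bi in zip(out, b)]
    return out


# Toy example: the "concept" spans the first two axes of a 3-D space.
concept_vocab = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormal_basis(concept_vocab)

token = [2.0, 3.0, 4.0]
cleaned = suppress(token, basis)        # concept component fully removed
partial = suppress(token, basis, 0.5)   # half of it removed
```

In this toy setup the cleaned token keeps only its third coordinate, which lies outside the concept subspace. Because the diffusion model's weights are never touched, an edit of this kind operates purely on the conditioning input, which matches the abstract's claim that no re-training is required.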
