CVAINov 21, 2024

Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction

arXiv:2411.13982v21 citationsh-index: 32Has Code2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Originality Incremental advance
AI Analysis

This addresses safety concerns in generative AI for users by preventing harmful content without disrupting learned manifolds, though it is incremental as it builds on existing model editing techniques.

The paper tackles the problem of harmful or inappropriate outputs in multimodal generative models by proposing an editing-free method that uses safe embeddings and a modified diffusion process to generate safer images while preserving semantic context, achieving state-of-the-art results on benchmarks with intuitive safety control.

Training multimodal generative models on large, uncurated datasets can result in users being exposed to harmful, unsafe and controversial or culturally-inappropriate outputs. While model editing has been proposed to remove or filter undesirable concepts in embedding and latent spaces, it can inadvertently damage learned manifolds, distorting concepts in close semantic proximity. We identify limitations in current model editing techniques, showing that even benign, proximal concepts may become misaligned. To address the need for safe content generation, we leverage safe embeddings and a modified diffusion process with tunable weighted summation in the latent space to generate safer images. Our method preserves global context without compromising the structural integrity of the learned manifolds. We achieve state-of-the-art results on safe image generation benchmarks and offer intuitive control over the level of model safety. We identify trade-offs between safety and censorship, which presents a necessary perspective in the development of ethical AI models. We will release our code. Keywords: Text-to-Image Models, Generative AI, Safety, Reliability, Model Editing

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes