CV · AI · Aug 2, 2025

Personalized Safety Alignment for Text-to-Image Diffusion Models

arXiv:2508.01151v2 · 2 citations · h-index: 8 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the need for personalized safety controls in generative AI, where users have diverse safety preferences, though it is an incremental improvement over existing safety mechanisms.

The paper tackles the problem of uniform safety standards in text-to-image diffusion models by proposing Personalized Safety Alignment (PSA), which integrates user-specific profiles to adjust safety behaviors. The authors report improved harmful content suppression and better alignment with user constraints, as measured by higher Win Rate and Pass Rate scores.

Text-to-image diffusion models have revolutionized visual content generation, but current safety mechanisms apply uniform standards that often fail to account for individual user preferences. These models overlook the diverse safety boundaries shaped by factors like age, mental health, and personal beliefs. To address this, we propose Personalized Safety Alignment (PSA), a framework that allows user-specific control over safety behaviors in generative models. PSA integrates personalized user profiles into the diffusion process, adjusting the model's behavior to match individual safety preferences while preserving image quality. We introduce a new dataset, Sage, which captures user-specific safety preferences, and incorporate these profiles through a cross-attention mechanism. Experiments show that PSA outperforms existing methods in harmful content suppression and aligns generated content better with user constraints, achieving higher Win Rate and Pass Rate scores. Our code, data, and models are publicly available at https://m-e-agi-lab.github.io/PSAlign/.
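The abstract says user profiles enter the diffusion process through a cross-attention mechanism but gives no further detail. As a rough illustration only, the sketch below shows one common way such conditioning can be wired: image tokens attend to the text tokens as usual, attend separately to profile tokens, and the two results are summed. The additive combination, the `profile_scale` knob, the function names, and the absence of learned projection matrices are all assumptions made for this example; they are not taken from the PSA code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    Each query vector returns a softmax-weighted average of `values`.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim_out = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim_out)]
        )
    return outputs


def profile_conditioned_attention(image_tokens, text_tokens, profile_tokens,
                                  profile_scale=1.0):
    """Hypothetical combination: text cross-attention plus a scaled
    cross-attention over user-profile tokens (an assumption, not PSA's code)."""
    text_out = attend(image_tokens, text_tokens, text_tokens)
    profile_out = attend(image_tokens, profile_tokens, profile_tokens)
    return [
        [t + profile_scale * p for t, p in zip(t_row, p_row)]
        for t_row, p_row in zip(text_out, profile_out)
    ]


# Toy inputs: two image tokens, two prompt tokens, one profile token.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.5, 0.5]]
profile_tokens = [[0.0, 1.0]]

conditioned = profile_conditioned_attention(image_tokens, text_tokens, profile_tokens)
unconditioned = profile_conditioned_attention(
    image_tokens, text_tokens, profile_tokens, profile_scale=0.0
)
```

Setting `profile_scale` to zero recovers plain text-conditioned attention, which is one reason an additive design is convenient: the profile branch can be weakened or switched off without touching the text pathway.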

