SP-Guard: Selective Prompt-adaptive Guidance for Safe Text-to-Image Generation
This work addresses safety concerns in text-to-image generation for users and platforms, offering an incremental improvement over existing inference-time guiding methods.
The paper tackles harmful content generation in text-to-image models by proposing SP-Guard, a method that adapts guidance strength to the harmfulness of the prompt and selectively targets unsafe image regions. The result is safer image generation with less unintended alteration than existing methods.
While diffusion-based T2I models have achieved remarkable image generation quality, they also enable easy creation of harmful content, raising social concerns and highlighting the need for safer generation. Existing inference-time guiding methods lack both adaptivity (adjusting guidance strength based on the prompt) and selectivity (targeting only unsafe regions of the image). Our method, SP-Guard, addresses these limitations by estimating prompt harmfulness and applying a selective guidance mask to guide only unsafe areas. Experiments show that SP-Guard generates safer images than existing methods while minimizing unintended content alteration. Beyond improving safety, our findings highlight the importance of transparency and controllability in image generation.
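To make the two ideas concrete, the following is a minimal NumPy sketch of how adaptive and selective guidance could be combined in a single denoising step. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give SP-Guard's formulas, so the linear mapping from harmfulness score to guidance strength, the quantile-thresholded mask, and the names `adaptive_scale`, `selective_mask`, `guided_noise`, `max_scale`, and `quantile` are all hypothetical. The noise predictions `eps_uncond`, `eps_prompt`, and `eps_unsafe` are assumed to come from a diffusion model conditioned on nothing, on the user prompt, and on an unsafe concept, respectively.

```python
import numpy as np


def adaptive_scale(harm_score, max_scale=10.0):
    """Adaptivity (assumed form): scale guidance linearly with a
    prompt harmfulness score clipped to [0, 1]."""
    return max_scale * float(np.clip(harm_score, 0.0, 1.0))


def selective_mask(eps_unsafe, eps_uncond, quantile=0.8):
    """Selectivity (assumed form): mark the spatial positions where the
    unsafe-concept direction is strongest, averaged over channels."""
    diff = np.abs(eps_unsafe - eps_uncond).mean(axis=0)  # shape (H, W)
    threshold = np.quantile(diff, quantile)
    return (diff > threshold).astype(eps_unsafe.dtype)


def guided_noise(eps_uncond, eps_prompt, eps_unsafe, harm_score,
                 cfg_scale=7.5, max_scale=10.0, quantile=0.8):
    """Classifier-free guidance plus a masked, harmfulness-scaled push
    away from the unsafe concept. All inputs have shape (C, H, W)."""
    cfg = eps_uncond + cfg_scale * (eps_prompt - eps_uncond)
    scale = adaptive_scale(harm_score, max_scale)
    mask = selective_mask(eps_unsafe, eps_uncond, quantile)
    # Broadcast the (H, W) mask over channels; unmasked pixels keep plain CFG.
    return cfg - scale * mask[None] * (eps_unsafe - eps_uncond)
```

In this sketch, a prompt scored as harmless (`harm_score=0.0`) reduces to ordinary classifier-free guidance, and for a harmful prompt only the masked positions are steered, which is one way the reduced unintended alteration described above could arise.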