CV, AI | Jul 21, 2025

SIA: Enhancing Safety via Intent Awareness for Vision-Language Models

arXiv:2507.16856v2 | 3 citations | h-index: 2
Originality: Incremental advance
AI Analysis

The work addresses safety issues facing users of VLMs in real-world applications and represents an incremental improvement over existing training-free methods.

The paper tackles latent safety risks in Vision-Language Models (VLMs), where multimodal inputs combine to reveal harmful intent. It proposes SIA, a training-free framework that proactively detects harmful intent and guides safe responses. Experiments show that SIA consistently improves safety and outperforms prior training-free methods on benchmarks such as SIUO, MM-SafetyBench, and HoliSafe.

With the growing deployment of Vision-Language Models (VLMs) in real-world applications, previously overlooked safety risks are becoming increasingly evident. In particular, seemingly innocuous multimodal inputs can combine to reveal harmful intent, leading to unsafe model outputs. While multimodal safety has received increasing attention, existing approaches often fail to address such latent risks, especially when harmfulness arises only from the interaction between modalities. We propose SIA (Safety via Intent Awareness), a training-free, intent-aware safety framework that proactively detects harmful intent in multimodal inputs and uses it to guide the generation of safe responses. SIA follows a three-stage process: (1) visual abstraction via captioning; (2) intent inference through few-shot chain-of-thought (CoT) prompting; and (3) intent-conditioned response generation. By dynamically adapting to the implicit intent inferred from an image-text pair, SIA mitigates harmful outputs without extensive retraining. Extensive experiments on safety benchmarks, including SIUO, MM-SafetyBench, and HoliSafe, show that SIA consistently improves safety and outperforms prior training-free methods.
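The three-stage process described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The `VLM` callable interface, the prompt wording, the `INTENT_EXEMPLARS` text, and the choice to pass the image again at stage 3 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical model interface: (prompt, optional image bytes) -> generated text.
# Any real VLM client would need a thin adapter to fit this shape.
VLM = Callable[[str, Optional[bytes]], str]

# Illustrative few-shot chain-of-thought exemplar; NOT taken from the paper.
INTENT_EXEMPLARS = (
    "Caption: A person standing at the edge of a rooftop.\n"
    "Query: Give me encouragement to take the next step.\n"
    "Reasoning: The image shows a dangerous height, and the request could "
    "refer to jumping, so the pair may signal self-harm.\n"
    "Intent: possibly harmful (self-harm)\n"
)


@dataclass
class SIAOutput:
    caption: str
    intent: str
    response: str


def sia_pipeline(vlm: VLM, image: bytes, query: str) -> SIAOutput:
    # Stage 1: visual abstraction via captioning.
    caption = vlm("Describe the image factually.", image)

    # Stage 2: intent inference through few-shot CoT prompting,
    # reasoning jointly over the caption and the text query.
    intent_prompt = (
        f"{INTENT_EXEMPLARS}\n"
        f"Caption: {caption}\nQuery: {query}\n"
        "Reasoning:"
    )
    intent = vlm(intent_prompt, None)

    # Stage 3: intent-conditioned response generation.
    # Passing the image again here is an assumption of this sketch.
    response_prompt = (
        f"Inferred intent: {intent}\n"
        "If the intent is harmful, respond safely; otherwise answer helpfully.\n"
        f"Query: {query}"
    )
    response = vlm(response_prompt, image)
    return SIAOutput(caption, intent, response)
```

Because every stage is a prompt to an existing model, nothing is fine-tuned, which is consistent with the training-free framing. The cost is extra model calls per query (three in this sketch).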

