CV · Aug 8, 2025

SDEval: Safety Dynamic Evaluation for Multimodal Large Language Models

Hanqing Wang, Yuan Tian, Mingyu Liu, Zhenhao Zhang, Xiangyang Zhu

arXiv:2508.06142v18.42 citations · h-index: 4 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses safety concerns for MLLM developers and users by providing a more robust evaluation method, though the advance is incremental because it builds on existing benchmarks.

The authors tackle the problem of outdated and contaminated safety benchmarks for Multimodal Large Language Models (MLLMs) by proposing SDEval, a dynamic evaluation framework that adjusts the distribution and complexity of existing benchmarks. In their experiments, SDEval significantly influenced safety evaluation results and exposed safety limitations of MLLMs.

In the rapidly evolving landscape of Multimodal Large Language Models (MLLMs), the safety concerns of their outputs have earned significant attention. Although numerous datasets have been proposed, they may become outdated with MLLM advancements and are susceptible to data contamination issues. To address these problems, we propose SDEval, the first safety dynamic evaluation framework to controllably adjust the distribution and complexity of safety benchmarks. Specifically, SDEval mainly adopts three dynamic strategies: text, image, and text-image dynamics to generate new samples from original benchmarks. We first explore the individual effects of text and image dynamics on model safety. Then, we find that injecting text dynamics into images can further impact safety, and conversely, injecting image dynamics into text also leads to safety risks. SDEval is general enough to be applied to various existing safety and even capability benchmarks. Experiments across safety benchmarks, MLLMGuard and VLSBench, and capability benchmarks, MMBench and MMVet, show that SDEval significantly influences safety evaluation, mitigates data contamination, and exposes safety limitations of MLLMs. Code is available at https://github.com/hq-King/SDEval
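The general idea of generating new samples from an original benchmark item can be illustrated with a toy sketch. This is not the SDEval implementation: the `Sample` type, the three strategy functions, and the specific perturbations (a distractor sentence, a horizontal flip) are hypothetical stand-ins chosen only to show how text, image, and combined transformations might be applied under a fixed seed so that the generated variants are reproducible.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List

Image = List[List[int]]  # toy grayscale image: rows of 0-255 pixel values


@dataclass(frozen=True)
class Sample:
    """One benchmark item: a text prompt paired with an image."""
    text: str
    image: Image


Strategy = Callable[[Sample, random.Random], Sample]


def text_dynamic(sample: Sample, rng: random.Random) -> Sample:
    """Hypothetical text dynamic: prepend an irrelevant distractor sentence."""
    distractors = ["The weather is mild today.", "Please answer carefully."]
    return replace(sample, text=f"{rng.choice(distractors)} {sample.text}")


def image_dynamic(sample: Sample, rng: random.Random) -> Sample:
    """Hypothetical image dynamic: horizontally flip the pixel grid.

    The rng argument is unused here; it is kept so that all strategies
    share one signature.
    """
    return replace(sample, image=[row[::-1] for row in sample.image])


def text_image_dynamic(sample: Sample, rng: random.Random) -> Sample:
    """Hypothetical combined dynamic: apply the text and image changes together."""
    return image_dynamic(text_dynamic(sample, rng), rng)


def generate_variants(
    sample: Sample, strategies: List[Strategy], seed: int = 0
) -> List[Sample]:
    """Return one new sample per strategy, leaving the original untouched."""
    rng = random.Random(seed)  # fixed seed keeps the generation reproducible
    return [strategy(sample, rng) for strategy in strategies]


original = Sample(text="Describe the picture.", image=[[0, 255], [10, 20]])
variants = generate_variants(
    original, [text_dynamic, image_dynamic, text_image_dynamic], seed=1
)
```

Each variant would then be scored by the same safety metric as the original item, so that any change in model behavior can be attributed to the applied transformation.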

