CL, Jan 27

Automated Safety Benchmarking: A Multi-agent Pipeline for LVLMs

Xiangyang Zhu, Yuan Tian, Zicheng Zhang, Qi Jia, Chunyi Li, Renrui Zhang, Heng Li, Zongrui Wang, Wei Sun

arXiv:2601.19507v1, 1.11 citations, h-index: 32

Originality: Incremental advance

AI Analysis

This addresses the need for efficient and dynamic safety evaluation in LVLMs, which is crucial for real-world reliability, though it is incremental as it automates an existing benchmarking process.

The paper tackles the problem of labor-intensive and static safety benchmarking for large vision-language models (LVLMs) by proposing VLSafetyBencher, an automated multi-agent pipeline that constructs high-quality safety benchmarks within one week at minimal cost, achieving a 70% safety rate disparity between the most and least safe models.

Large vision-language models (LVLMs) exhibit remarkable capabilities in cross-modal tasks but face significant safety challenges, which undermine their reliability in real-world applications. Efforts have been made to build LVLM safety evaluation benchmarks to uncover their vulnerabilities. However, existing benchmarks are hindered by their labor-intensive construction process, static complexity, and limited discriminative power. Thus, they may fail to keep pace with rapidly evolving models and emerging risks. To address these limitations, we propose VLSafetyBencher, the first automated system for LVLM safety benchmarking. VLSafetyBencher introduces four collaborative agents (Data Preprocessing, Generation, Augmentation, and Selection) that construct and select high-quality samples. Experiments validate that VLSafetyBencher can construct high-quality safety benchmarks within one week at minimal cost. The generated benchmark effectively distinguishes models by safety, with a safety rate disparity of 70% between the most and least safe models.
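
The "safety rate disparity" quoted above can be illustrated with a minimal sketch. It assumes that a model's safety rate is the fraction of its responses judged safe, and that the disparity is the gap between the highest and lowest rates across models; the abstract does not spell out either definition. The function names, model names, and toy judgements below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def safety_rate(judgements: List[bool]) -> float:
    """Fraction of responses judged safe (assumed definition)."""
    if not judgements:
        raise ValueError("at least one judgement is required")
    return sum(judgements) / len(judgements)


def safety_rate_disparity(per_model: Dict[str, List[bool]]) -> float:
    """Gap between the safest and least safe model (assumed definition)."""
    if not per_model:
        raise ValueError("at least one model is required")
    rates = [safety_rate(j) for j in per_model.values()]
    return max(rates) - min(rates)


# Toy data, invented for illustration only: True means "judged safe".
toy_results = {
    "model_a": [True] * 8 + [False] * 2,  # 8 of 10 responses safe
    "model_b": [True] * 3 + [False] * 7,  # 3 of 10 responses safe
}

disparity = safety_rate_disparity(toy_results)
print(f"safety rate disparity: {disparity:.0%}")
```

Under these assumptions, a larger disparity means the benchmark separates models more sharply, which is the sense in which the abstract speaks of discriminative power.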
