CV · AI · LG · Jun 7, 2024

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

arXiv:2406.05113v3 · 33 citations
Originality: Incremental advance
AI Analysis

LlavaGuard addresses the critical need for reliable safety guardrails in vision AI systems, though the contribution appears incremental: it builds on existing VLM approaches and adds new data and policy customization.

The paper tackles the problem of ensuring safety in vision datasets and models by introducing LlavaGuard, an open VLM-based framework that includes a safety dataset and models, which outperforms state-of-the-art methods in accuracy and flexibility for policy handling.

This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable guardrails in the era of large-scale data and models. To this end, we establish a novel open framework that describes a customizable safety taxonomy, data preprocessing, augmentation, and training setup. To teach a VLM safeguard about safety, we further create a multimodal safety dataset with high-quality human expert annotations, where each image is labeled with a safety rating, category, and rationale. We also employ advanced augmentations to support context-specific assessments. The resulting LlavaGuard models, ranging from 0.5B to 7B parameters, serve as a versatile tool for evaluating the safety compliance of visual content against flexible policies. In comprehensive experiments, LlavaGuard outperforms both state-of-the-art safeguards and VLMs in accuracy and in flexibly handling different policies. Additionally, we demonstrate LlavaGuard's performance in two real-world applications: large-scale dataset annotation and moderation of text-to-image models. We make our entire framework publicly available, including the dataset, model weights, and training code.
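
To make the annotation structure in the abstract concrete, the following is a minimal sketch of a per-image record carrying a safety rating, category, and rationale, together with one possible reading of a "flexible policy". Everything here is an assumption for illustration: the class name `SafetyAnnotation`, the label strings `"safe"`/`"unsafe"`, the category names, and the helper `assess_under_policy` are hypothetical and are not taken from the LlavaGuard dataset or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyAnnotation:
    """Hypothetical per-image record: rating, category, and rationale."""
    rating: str      # assumed label strings: "safe" or "unsafe"
    category: str    # assumed taxonomy category name
    rationale: str   # free-text explanation for the rating


def assess_under_policy(annotation: SafetyAnnotation,
                        enforced_categories: set[str]) -> SafetyAnnotation:
    """Re-evaluate an annotation against a policy.

    Assumption: a policy is modeled as the set of categories it enforces.
    An "unsafe" rating in a category the policy does not enforce is treated
    as compliant; all other annotations are returned unchanged.
    """
    if annotation.rating == "unsafe" and annotation.category not in enforced_categories:
        return SafetyAnnotation(
            rating="safe",
            category=annotation.category,
            rationale=f"Category '{annotation.category}' is not enforced by this policy.",
        )
    return annotation


# Usage: the same image annotation assessed under two hypothetical policies.
original = SafetyAnnotation("unsafe", "weapons", "The image shows a firearm.")
strict = assess_under_policy(original, {"weapons", "self-harm"})
relaxed = assess_under_policy(original, {"self-harm"})
```

Under these assumptions, `strict` keeps the original "unsafe" rating while `relaxed` is rated "safe", which mirrors the idea of assessing the same visual content against different policies.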

Code Implementations: 1 repo
