GuardReasoner-Omni: A Reasoning-based Multi-modal Guardrail for Text, Image, Video, and Audio
This work addresses the need for a unified, reasoning-driven guardrail across multiple modalities, offering improved safety moderation for AI systems.
GuardReasoner-Omni is a reasoning-based guardrail model for moderating text, image, video, and audio. Trained with a two-stage pipeline on 181k samples and released at 3B and 7B parameters, it outperforms existing state-of-the-art baselines across multiple guardrail benchmarks.
We present GuardReasoner-Omni, a reasoning-based guardrail model designed to moderate text, image, video, and audio data. First, we construct a comprehensive training corpus of 181k samples spanning these four modalities. Our training pipeline follows a two-stage paradigm that incentivizes the model to deliberate before making decisions: (1) supervised fine-tuning (SFT) to cold-start the model with explicit reasoning capabilities and structural adherence; and (2) reinforcement learning (RL) with a concise correctness reward that preserves accurate reasoning while suppressing redundant generation. We release a suite of models at 3B and 7B parameters. Extensive experiments demonstrate that GuardReasoner-Omni outperforms existing state-of-the-art baselines across various guardrail benchmarks.
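The abstract does not give the exact form of the concise correctness reward, so the following is only a minimal sketch of one way such a reward could be shaped. The function name `concise_correctness_reward`, the parameters `token_budget` and `length_weight`, and the linear length penalty are all assumptions made for illustration, not the authors' formulation.

```python
def concise_correctness_reward(
    predicted_label: str,
    gold_label: str,
    num_reasoning_tokens: int,
    token_budget: int = 512,      # hypothetical budget, not from the paper
    length_weight: float = 0.5,   # hypothetical penalty weight, not from the paper
) -> float:
    """Illustrative reward: correctness gated, then discounted by reasoning length."""
    if token_budget <= 0:
        raise ValueError("token_budget must be positive")

    # A wrong moderation decision earns nothing, however short the reasoning is.
    if predicted_label.strip().lower() != gold_label.strip().lower():
        return 0.0

    # Fraction of the budget consumed, clipped to [0, 1].
    used = min(max(num_reasoning_tokens, 0) / token_budget, 1.0)

    # A correct decision starts at 1.0 and loses up to `length_weight`
    # as the reasoning trace approaches or exceeds the budget.
    return 1.0 - length_weight * used


# Example: a correct prediction that uses half of the assumed budget.
reward = concise_correctness_reward("harmful", "harmful", num_reasoning_tokens=256)
```

In this sketch the length penalty applies only to correct predictions, so brevity can never compensate for a wrong decision. That is one plausible reading of "preserve accurate reasoning while suppressing redundant generation"; the actual reward in the paper may differ.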