CV · CL · May 17

SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening

Shahriar Kabir Nahin, Hadi Askari, Muhao Chen, Anshuman Chhabra

arXiv:2605.17610 · 94.1 · Has Code

Predicted impact: top 10% in CV · last 90 days
Originality: Incremental advance

AI Analysis

For video content moderation, SafeLens provides an efficient and accurate solution that outperforms both open-source and closed-source models, demonstrating that efficient design can be more effective than scaling data or model size.

SafeLens introduces a fast-and-slow inference architecture for video guardrails, achieving state-of-the-art performance on real-world and AI-generated video benchmarks while significantly reducing inference cost compared to existing models like SafeWatch-8B, OmniGuard-7B, GPT-5.4, and Gemini-3.1-pro.

The rapid growth of online video platforms and AI-generated content has made reliable video guardrails a key challenge for safety and real-world deployment. While most videos can be screened through fast pattern recognition, a small subset requires deeper reasoning over temporally complex content and nuanced policy constraints. Existing approaches typically apply large vision-language models uniformly across all inputs, resulting in high inference costs and inefficient allocation of computation. We propose SafeLens, a video guardrail framework that introduces a fast-and-slow inference architecture for efficient and accurate content moderation with variable computational cost across inputs. Additionally, we construct a high-quality dataset by applying influence-guided filtering to the SafeWatch dataset, retaining only 2.4% of the original data. To further address the limitations of training-time scaling, we enable test-time reasoning by augmenting the filtered data with structured Chain-of-Thought traces. Across real-world and AI-generated video benchmarks, SafeLens achieves state-of-the-art performance, outperforming strong open-source video guardrails (e.g., SafeWatch-8B, OmniGuard-7B) and closed-source models (e.g., GPT-5.4, Gemini-3.1-pro) while significantly reducing inference cost, demonstrating that efficient design can be more effective than scaling data or model size alone.
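The fast-and-slow idea in the abstract can be illustrated with a generic two-stage cascade. This is only a minimal sketch under stated assumptions, not the SafeLens implementation: the abstract does not say how inputs are routed, so the confidence threshold, the `screen` function, and the toy models below are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# A model maps a video to (label, confidence in [0, 1]).
# This interface is an assumption made for illustration only.
Model = Callable[[Any], Tuple[str, float]]


@dataclass
class CascadeResult:
    label: str
    confidence: float
    path: str  # "fast" or "slow", records which stage produced the verdict


def screen(video: Any, fast_model: Model, slow_model: Model,
           threshold: float = 0.9) -> CascadeResult:
    """Run the cheap screener first; escalate only when it is unsure.

    Routing by a confidence threshold is an assumed rule; the paper may
    use a different escalation criterion.
    """
    label, conf = fast_model(video)
    if conf >= threshold:
        return CascadeResult(label, conf, "fast")
    # Low confidence: pay for the slower, deliberate reasoning stage.
    label, conf = slow_model(video)
    return CascadeResult(label, conf, "slow")


# Toy stand-ins so the sketch runs without any real video model.
def toy_fast(video: dict) -> Tuple[str, float]:
    label = "unsafe" if video["flag"] else "safe"
    return label, 1.0 - video["ambiguity"]


def toy_slow(video: dict) -> Tuple[str, float]:
    return video["true_label"], 0.99


easy = {"flag": False, "ambiguity": 0.05, "true_label": "safe"}
hard = {"flag": False, "ambiguity": 0.50, "true_label": "unsafe"}

easy_result = screen(easy, toy_fast, toy_slow)
hard_result = screen(hard, toy_fast, toy_slow)
```

In this toy setup the clear-cut video is resolved by the fast stage alone, while the ambiguous one is escalated, so the expensive model runs only on the inputs that need it. That per-input variation is what "variable computational cost across inputs" refers to.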
