VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search
This work addresses safety vulnerabilities in multimodal AI systems, which is critical for real-world deployment. The contribution appears incremental, though: it builds on existing alignment methods, extending them with a novel multimodal approach.
The paper tackles the challenge of safety alignment in Large Vision-Language Models (LVLMs) against multimodal jailbreaks, proposing VisuoAlign, a framework that uses prompt-guided tree search to embed safety constraints into reasoning, resulting in significant improvements in robustness against cross-modal threats.
Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a critical challenge. Existing defenses are vulnerable to multimodal jailbreaks, as visual inputs introduce new attack surfaces, reasoning chains lack safety supervision, and alignment often degrades under modality fusion. To overcome these limitations, we propose VisuoAlign, a framework for multimodal safety alignment via prompt-guided tree search. VisuoAlign embeds safety constraints into the reasoning process through visual-textual interactive prompts, employs Monte Carlo Tree Search (MCTS) to systematically construct diverse safety-critical prompt trajectories, and introduces prompt-based scaling to ensure real-time risk detection and compliant responses. Extensive experiments demonstrate that VisuoAlign proactively exposes risks, enables comprehensive dataset generation, and significantly improves the robustness of LVLMs against complex cross-modal threats.
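The abstract does not give the details of the tree search, so the following is only a minimal sketch of how generic MCTS (selection, expansion, simulation, backpropagation) could enumerate prompt trajectories ranked by risk. Everything in it is an assumption made for illustration: the action names in `ACTIONS`, the depth limit `MAX_DEPTH`, and the `toy_risk_score` function, which stands in for whatever safety judge the authors actually use. It is not the VisuoAlign implementation.

```python
import math
import random

# Hypothetical prompt-edit actions; the real action space is not given in the abstract.
ACTIONS = ["add_image_context", "rephrase_text", "inject_roleplay", "split_request"]
MAX_DEPTH = 3  # assumed trajectory length, chosen only to keep the example small


def toy_risk_score(trajectory):
    """Stand-in for a safety judge: returns a made-up risk value in [0, 1]."""
    score = 0.2 * len(set(trajectory))
    if "inject_roleplay" in trajectory and "add_image_context" in trajectory:
        score += 0.4  # pretend that cross-modal combinations are riskier
    return min(score, 1.0)


class Node:
    def __init__(self, trajectory, parent=None):
        self.trajectory = trajectory  # tuple of actions taken so far
        self.parent = parent
        self.children = {}  # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.trajectory) >= MAX_DEPTH

    def untried_actions(self):
        return [a for a in ACTIONS if a not in self.children]

    def ucb_child(self, c=1.4):
        # UCB1: mean reward plus an exploration bonus for rarely visited children.
        def ucb(child):
            mean = child.total_reward / child.visits
            return mean + c * math.sqrt(math.log(self.visits) / child.visits)
        return max(self.children.values(), key=ucb)


def rollout(trajectory, rng):
    """Complete a partial trajectory with random actions and score the result."""
    trajectory = list(trajectory)
    while len(trajectory) < MAX_DEPTH:
        trajectory.append(rng.choice(ACTIONS))
    return tuple(trajectory), toy_risk_score(trajectory)


def search(iterations=200, seed=0):
    """Run MCTS and return all scored trajectories, riskiest first."""
    rng = random.Random(seed)
    root = Node(trajectory=())
    found = {}
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and not node.untried_actions():
            node = node.ucb_child()
        # Expansion: add one untried child.
        if not node.is_terminal():
            action = rng.choice(node.untried_actions())
            child = Node(node.trajectory + (action,), parent=node)
            node.children[action] = child
            node = child
        # Simulation: random completion scored by the toy judge.
        full, reward = rollout(node.trajectory, rng)
        found[full] = reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return sorted(found.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `search()` returns a list of `(trajectory, risk)` pairs sorted from highest to lowest toy risk. In a real pipeline the high-risk trajectories would presumably be turned into alignment data, and the random rollout and toy scorer would be replaced by model queries and a proper safety evaluator.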