CL · AI · Jun 21, 2024

From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking

arXiv:2406.14859v1 · 34 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses security issues relevant to users and developers of MLLMs, but its contribution is incremental: it primarily reviews existing research rather than introducing new methods.

This paper tackles adversarial vulnerabilities in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) by surveying jailbreaking research, and it highlights that the multimodal domain remains underexplored compared to unimodal approaches.

The rapid development of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has exposed vulnerabilities to various adversarial attacks. This paper provides a comprehensive overview of jailbreaking research targeting both LLMs and MLLMs, highlighting recent advancements in evaluation benchmarks, attack techniques, and defense strategies. Compared to the more advanced state of unimodal jailbreaking, the multimodal domain remains underexplored. We summarize the limitations and potential research directions of multimodal jailbreaking, aiming to inspire future research and further enhance the robustness and security of MLLMs.

