StructBreak: Structural Cognitive Overload-Induced Safety Failures in MLLMs
For developers and users of multimodal LLMs, this work exposes a critical safety vulnerability arising from structural reasoning, highlighting the inadequacy of existing alignment methods.
The paper identifies Structural Cognitive Overload (SCO) as a cause of safety failures in MLLMs and proposes StructBreak, an automated framework to exploit this vulnerability. Under black-box settings, it achieves a 92% average attack success rate (up to 97% on Gemini 2.5) across ten threat scenarios, revealing that current alignment paradigms are insufficient for complex multimodal reasoning.
Multimodal Large Language Models (MLLMs) excel at structural reasoning, yet they exhibit sharp logical brittleness with respect to structural consistency. We term this phenomenon Structural Cognitive Overload (SCO), a byproduct of the contention between deep reasoning and safety alignment. Prior work has predominantly targeted typographic and pixel-level perturbations, leaving SCO largely unexplored. To address this gap, we propose StructBreak, an automated end-to-end framework designed to quantify SCO. Using StructBreak, we uncover a novel higher-order cognitive overload attack paradigm that operates under a practical black-box setting, requiring no internal model access. We then use the framework to establish a comprehensive benchmark spanning ten diverse threat scenarios. Empirical evaluations on six leading MLLMs reveal that SCO readily triggers toxic generation, yielding a 92% average attack success rate (ASR) and up to 97% on Gemini 2.5. To elucidate the mechanism of SCO, we further conduct model-level interpretations spanning attention dynamics, latent space topology, and geometric analysis. Our findings reveal that StructBreak acts as a novel structural channel that circumvents safety filters. Furthermore, the limited efficacy of inherent safety mechanisms underscores that current alignment paradigms are insufficient for the era of complex multimodal reasoning.
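The headline figure above is an average attack success rate. As a reading aid, here is a minimal sketch of how such a number could be computed from per-attempt judgments. It assumes an unweighted (macro) average over models; the abstract does not say whether the reported average is taken per model, per scenario, or over pooled attempts. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def attack_success_rate(judgments):
    """Fraction of attempts judged successful (True) for one model."""
    if not judgments:
        raise ValueError("need at least one judged attempt")
    return sum(judgments) / len(judgments)


def average_asr(per_model):
    """Unweighted (macro) mean of the per-model ASR values.

    Assumption: each model counts equally, regardless of how many
    attempts were evaluated on it.
    """
    return mean(attack_success_rate(j) for j in per_model.values())


# Hypothetical toy judgments for illustration only, NOT results from the paper.
toy_judgments = {
    "model_a": [True, True, False, True],   # 3 of 4 attempts succeed
    "model_b": [True, False, False, True],  # 2 of 4 attempts succeed
}

per_model_asr = {name: attack_success_rate(j) for name, j in toy_judgments.items()}
print(per_model_asr)
print(average_asr(toy_judgments))
```

A macro average keeps a model with many evaluated attempts from dominating the figure. Pooling all attempts before dividing would weight models by attempt count and can yield a different number when the counts differ.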