LG AI CR · Jan 29, 2025

Topological Signatures of Adversaries in Multimodal Alignments

Minh Vu, Geigh Zollicoffer, Huy Mai, Ben Nebgen, Boian Alexandrov, Manish Bhattarai

arXiv:2501.18006v1 · 9.42 citations · h-index: 25 · ICML

Originality: Incremental advance

AI Analysis

The paper addresses the underexplored issue of adversarial robustness in multimodal machine learning systems. The advance is incremental because it builds on existing unimodal defenses.

The paper tackles adversarial attacks on multimodal systems such as CLIP and BLIP by investigating topological signatures in image-text alignments. It shows that adversarial perturbations disrupt these alignments, and it introduces two novel topological-contrastive losses and an algorithm for improved detection.

Multimodal Machine Learning systems, particularly those aligning text and image data like CLIP/BLIP models, have become increasingly prevalent, yet remain susceptible to adversarial attacks. While substantial research has addressed adversarial robustness in unimodal contexts, defense strategies for multimodal systems are underexplored. This work investigates the topological signatures that arise between image and text embeddings and shows how adversarial attacks disrupt their alignment, introducing distinctive signatures. We specifically leverage persistent homology and introduce two novel Topological-Contrastive losses based on Total Persistence and Multi-scale kernel methods to analyze the topological signatures introduced by adversarial perturbations. We observe a pattern of monotonic changes in the proposed topological losses emerging in a wide range of attacks on image-text alignments, as more adversarial samples are introduced in the data. By designing an algorithm to back-propagate these signatures to input samples, we are able to integrate these signatures into Maximum Mean Discrepancy tests, creating a novel class of tests that leverage topological signatures for better adversarial detection.
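To make the notion of Total Persistence concrete, here is a minimal sketch that is not the authors' implementation. It assumes a Vietoris-Rips filtration on a plain point cloud and considers only 0-dimensional homology, where every connected component is born at scale 0 and dies at the length of a minimum-spanning-tree edge. The function names (`h0_death_times`, `total_persistence`) and the exponent parameter `p` are hypothetical, and the paper's losses are defined on image-text embeddings and may differ in dimension and normalization.

```python
import math


def h0_death_times(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Assumption: components merge at the pairwise distance itself (not half
    of it), so the finite death times are the minimum-spanning-tree edge
    lengths. Computed with Prim's algorithm on the dense distance matrix.
    """
    n = len(points)
    if n < 2:
        return []
    dist = [[math.dist(a, b) for b in points] for a in points]
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0
    deaths = []
    for step in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the first vertex starts the tree and adds no edge
            deaths.append(best[u])
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return sorted(deaths)


def total_persistence(points, p=1.0):
    """Sum of (death - birth)^p over finite H0 bars; births are all 0."""
    return sum(d ** p for d in h0_death_times(points))


if __name__ == "__main__":
    # Toy 2-D "embeddings": the four corners of a unit square.
    square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(total_persistence(square))
```

Comparing this quantity on a clean batch of embeddings against a batch containing perturbed samples gives a rough feel for how a topological summary can shift under perturbation. A differentiable version suitable for use as a loss would need an autodiff framework and is outside this sketch.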
