AI · LG · Apr 7, 2025

Don't Lag, RAG: Training-Free Adversarial Detection Using RAG

arXiv:2504.04858v3 · 6 citations · h-index: 20 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses adversarial patch detection for vision systems with a practical, training-free solution. The advance is incremental: it builds on existing VLMs and retrieval methods.

The paper tackles adversarial patch attacks on vision systems by proposing a training-free Visual Retrieval-Augmented Generation (VRAG) framework that uses Vision-Language Models for detection, achieving up to 95% accuracy with open-source models and 98% with a closed-source model.

Adversarial patch attacks pose a major threat to vision systems by embedding localized perturbations that mislead deep models. Traditional defense methods often require retraining or fine-tuning, making them impractical for real-world deployment. We propose a training-free Visual Retrieval-Augmented Generation (VRAG) framework that integrates Vision-Language Models (VLMs) for adversarial patch detection. By retrieving visually similar patches and images that resemble stored attacks in a continuously expanding database, VRAG performs generative reasoning to identify diverse attack types, all without additional training or fine-tuning. We extensively evaluate open-source large-scale VLMs, including Qwen-VL-Plus, Qwen2.5-VL-72B, and UI-TARS-72B-DPO, alongside Gemini-2.0, a closed-source model. Notably, the open-source UI-TARS-72B-DPO model achieves up to 95 percent classification accuracy, setting a new state-of-the-art for open-source adversarial patch detection. Gemini-2.0 attains the highest overall accuracy, 98 percent, but remains closed-source. Experimental results demonstrate VRAG's effectiveness in identifying a variety of adversarial patches with minimal human annotation, paving the way for robust, practical defenses against evolving adversarial patch attacks.
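The retrieve-then-reason flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `PatchDatabase`, `cosine`, and `build_prompt` are hypothetical helpers, the three-dimensional embeddings are hand-made stand-ins for real image features, and cosine similarity is one plausible retrieval metric rather than one the abstract specifies. The VLM call itself is omitted; the sketch stops at the text prompt that would accompany the images.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class PatchDatabase:
    """Toy store of embeddings for known adversarial patches (hypothetical)."""
    embeddings: list = field(default_factory=list)
    labels: list = field(default_factory=list)

    def add(self, embedding, label):
        # New attacks are appended as they are observed; nothing is retrained.
        self.embeddings.append(list(embedding))
        self.labels.append(label)

    def retrieve(self, query, k=2):
        # Rank stored patches by similarity to the query and keep the top k.
        scored = [(cosine(query, emb), label)
                  for emb, label in zip(self.embeddings, self.labels)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_prompt(retrieved):
    # Text half of a VLM query; the images would be attached alongside it.
    lines = [f"- known attack '{label}' (similarity {score:.2f})"
             for score, label in retrieved]
    return ("Reference adversarial patches retrieved from the database:\n"
            + "\n".join(lines)
            + "\nDoes the query image contain an adversarial patch? "
              "Answer yes or no.")


# Usage with made-up embeddings (assumed values, not data from the paper).
db = PatchDatabase()
db.add([1.0, 0.0, 0.0], "patch_a")
db.add([0.0, 1.0, 0.0], "patch_b")
db.add([0.9, 0.1, 0.0], "patch_c")

hits = db.retrieve([1.0, 0.0, 0.0], k=2)
prompt = build_prompt(hits)
print(prompt)
```

Because detection depends only on what the database contains and on the prompt, covering a new attack type in this sketch amounts to another `add` call, which is the sense in which such a pipeline needs no training or fine-tuning.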

