CV · Feb 13, 2025

Pulling Back the Curtain: Unsupervised Adversarial Detection via Contrastive Auxiliary Networks

arXiv:2502.09110v3 · 6.25 citations · h-index: 8 · 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

Originality: Highly original

AI Analysis

This work addresses the problem of adversarial attack detection for users of deep learning models in safety-critical applications, providing an incremental yet effective solution.

The authors propose U-CAN, an unsupervised detector of adversarial attacks on deep learning models. It achieves superior F1 scores against four distinct attack methods and remains effective across multiple datasets and architectures.

Deep learning models are widely employed in safety-critical applications yet remain susceptible to adversarial attacks, imperceptible perturbations that can significantly degrade model performance. Conventional defense mechanisms predominantly focus on either enhancing model robustness or detecting adversarial inputs independently. In this work, we propose Unsupervised adversarial detection via Contrastive Auxiliary Networks (U-CAN), which uncovers adversarial behavior within auxiliary feature representations without the need for adversarial examples. U-CAN is embedded within selected intermediate layers of the target model. Its auxiliary networks, comprising projection layers and ArcFace-based linear layers, refine feature representations to more effectively distinguish between benign and adversarial inputs. Comprehensive experiments across multiple datasets (CIFAR-10, Mammals, and a subset of ImageNet) and architectures (ResNet-50, VGG-16, and ViT) demonstrate that our method surpasses existing unsupervised adversarial detection techniques, achieving superior F1 scores against four distinct attack methods. The proposed framework provides a scalable and effective solution for enhancing the security and reliability of deep learning systems.
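
The abstract mentions ArcFace-based linear layers but does not spell out their form. As a rough illustration only, the sketch below computes standard ArcFace-style logits in plain Python: features and class weights are L2-normalized, an angular margin is added to the target class angle, and the result is scaled. The function names, the `scale` and `margin` defaults, and the toy inputs are all assumptions made for this example and are not taken from the paper; the usual safeguard for angles where `theta + margin` exceeds pi is also omitted.

```python
import math


def l2_normalize(v):
    """Return v scaled to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_to_classes(feature, class_weights):
    """Cosine similarity between one feature vector and each class weight vector."""
    f = l2_normalize(feature)
    return [
        sum(a * b for a, b in zip(f, l2_normalize(w)))
        for w in class_weights
    ]


def arcface_logits(feature, class_weights, target, scale=30.0, margin=0.5):
    """ArcFace-style logits: scale * cos(theta + margin) for the target class,
    scale * cos(theta) for every other class.

    Hypothetical helper for illustration; scale and margin are assumed values.
    """
    logits = []
    for j, c in enumerate(cosine_to_classes(feature, class_weights)):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits


# Toy usage: a 2-D feature aligned with class 0.
toy_logits = arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0)
```

The margin penalizes the target class during training, which pushes features of the same class into a tighter angular cluster. How U-CAN turns such representations into a benign-versus-adversarial decision is not described in the abstract, so that step is not sketched here.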
