LGCRCVDec 14, 2025

GradID: Adversarial Detection via Intrinsic Dimensionality of Gradients

arXiv:2512.12827v1
Originality Highly original
AI Analysis

This addresses the critical vulnerability of deep neural networks to adversarial perturbations for applications like medical diagnosis and autonomous driving.

The paper tackles the problem of detecting adversarial attacks on deep neural networks by analyzing the intrinsic dimensionality of gradient parameters, revealing consistent differences between natural and adversarial data. The method achieves state-of-the-art results with detection rates consistently above 92% on CIFAR-10 against various attacks.

Despite their remarkable performance, deep neural networks exhibit a critical vulnerability: small, often imperceptible, adversarial perturbations can lead to drastically altered model predictions. Given the stringent reliability demands of applications such as medical diagnosis and autonomous driving, robust detection of such adversarial attacks is paramount. In this paper, we investigate the geometric properties of a model's input loss landscape. We analyze the Intrinsic Dimensionality (ID) of the model's gradient parameters, which quantifies the minimal number of coordinates required to describe the data points on their underlying manifold. We reveal a distinct and consistent difference in the ID for natural and adversarial data, which forms the basis of our proposed detection method. We validate our approach across two distinct operational scenarios. First, in a batch-wise context for identifying malicious data groups, our method demonstrates high efficacy on datasets like MNIST and SVHN. Second, in the critical individual-sample setting, we establish new state-of-the-art results on challenging benchmarks such as CIFAR-10 and MS COCO. Our detector significantly surpasses existing methods against a wide array of attacks, including CW and AutoAttack, achieving detection rates consistently above 92\% on CIFAR-10. The results underscore the robustness of our geometric approach, highlighting that intrinsic dimensionality is a powerful fingerprint for adversarial detection across diverse datasets and attack strategies.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes