LG AI CV · Jul 14, 2023

Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning

arXiv:2307.07250v213.713 citations · h-index: 11 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses adversarial threats in AI systems, offering a novel causal method to enhance robustness, though it appears incremental as it builds on existing adversarial training frameworks.

The paper tackles the problem of adversarial vulnerability in deep neural networks by introducing Adversarial Double Machine Learning (ADML), a causal approach that quantifies vulnerability and estimates causal parameters of perturbations, resulting in improved adversarial robustness with large margins across CNN and Transformer architectures.

Adversarial examples derived from deliberately crafted perturbations on visual inputs can easily harm the decision process of deep neural networks. To prevent potential threats, various adversarial training-based defense methods have grown rapidly and become a de facto standard approach for robustness. Despite recent competitive achievements, we observe that adversarial vulnerability varies across targets and that certain vulnerabilities remain prevalent. Intriguingly, this peculiar phenomenon cannot be relieved even with deeper architectures and advanced defense methods. To address this issue, we introduce a causal approach called Adversarial Double Machine Learning (ADML), which allows us to quantify the degree of adversarial vulnerability for network predictions and capture the effect of treatments on outcomes of interest. ADML can directly estimate the causal parameter of adversarial perturbations per se and mitigate negative effects that can potentially damage robustness, bringing a causal perspective to adversarial vulnerability. Through extensive experiments on various CNN and Transformer architectures, we corroborate that ADML improves adversarial robustness by large margins and relieves the empirical observation.
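For readers unfamiliar with the "double machine learning" that ADML builds on, the following is a minimal sketch of the generic cross-fitted partialling-out estimator for a partially linear model, Y = theta*T + g(X) + noise. It is not the paper's ADML procedure: the synthetic data, the linear nuisance learners, and the names `fit_linear`, `dml_theta`, and `simulate` are all assumptions made for illustration. In the adversarial setting the treatment T would presumably correspond to the perturbation and Y to the network's prediction, but that mapping is only suggested by the abstract.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares nuisance learner (illustrative); returns a predictor."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def dml_theta(X, T, Y, n_folds=2, seed=0):
    """Cross-fitted partialling-out estimate of theta in Y = theta*T + g(X) + noise."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(Y)), n_folds)
    res_T = np.empty(len(Y))
    res_Y = np.empty(len(Y))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Nuisance models are fit on the other folds, then used to
        # residualize treatment and outcome on the held-out fold.
        res_T[test] = T[test] - fit_linear(X[train], T[train])(X[test])
        res_Y[test] = Y[test] - fit_linear(X[train], Y[train])(X[test])
    # Regress outcome residuals on treatment residuals.
    return float(res_T @ res_Y / (res_T @ res_T))

def simulate(n=4000, theta=-1.5, seed=0):
    """Synthetic data (hypothetical): X confounds both T and Y."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    T = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)
    Y = theta * T + X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
    return X, T, Y

X, T, Y = simulate()
theta_hat = dml_theta(X, T, Y)
print(f"estimated theta: {theta_hat:.3f} (true value -1.5)")
```

A negative theta here plays the role of a treatment that hurts the outcome. The cross-fitting step (fitting nuisance models on one fold and residualizing on another) is what distinguishes this from a plain regression of Y on T.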
