CV | Oct 16, 2023

DANAA: Towards transferable attacks with double adversarial neuron attribution

arXiv:2310.10427v2 | 13 citations | h-index: 12 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of creating transferable adversarial attacks on deep neural networks. The advance is incremental, as it builds on existing feature-level attack methods.

The paper tackles the problem of improving transferability in feature-level adversarial attacks by proposing DANAA, a method that uses double adversarial neuron attribution for more accurate feature importance estimation, achieving state-of-the-art performance on benchmark datasets.

While deep neural networks achieve excellent results in many fields, they are susceptible to interference from attacking samples, resulting in erroneous judgments. Feature-level attacks are one effective attack type: they target the learnt features in the hidden layers to improve transferability across different models. Yet it has been observed that transferability is largely affected by the neuron importance estimation results. In this paper, a double adversarial neuron attribution attack method, termed 'DANAA', is proposed to obtain more accurate feature importance estimation. In our method, the model outputs are attributed to the middle layer based on an adversarial non-linear path. The goal is to measure the weight of individual neurons and retain the features that are more important for transferability. We have conducted extensive experiments on benchmark datasets to demonstrate the state-of-the-art performance of our method. Our code is available at: https://github.com/Davidjinzb/DANAA
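To make the idea of attributing a model output to middle-layer neurons along an adversarial non-linear path more concrete, here is a minimal sketch on a toy two-layer network. It is not the authors' implementation (see the linked repository for that). The network, the sign-gradient path, the step size, and all function names below are assumptions chosen for illustration, and the sketch does not attempt to reproduce the "double" aspect of DANAA.

```python
import numpy as np

# Toy model (assumption): 6 inputs -> 4 middle-layer neurons -> scalar output.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 6))
w2 = rng.normal(size=4)

def middle(x):
    """Middle-layer activations y = tanh(W1 x)."""
    return np.tanh(W1 @ x)

def output(x):
    """Scalar model output z = w2 . y."""
    return float(w2 @ middle(x))

def grad_output_wrt_middle(x):
    """dz/dy. The toy head is linear, so this is w2 for every x."""
    return w2

def grad_output_wrt_input(x):
    """dz/dx via the chain rule through tanh."""
    y = middle(x)
    return W1.T @ (w2 * (1.0 - y ** 2))

def path_attribution(x0, steps=20, step_size=0.02):
    """Accumulate gradient * activation change along a sign-gradient path.

    Each step moves the input in the direction that lowers the output,
    so the path bends with the gradient instead of being a straight line.
    """
    x = x0.copy()
    attr = np.zeros_like(w2)
    for _ in range(steps):
        x_next = x - step_size * np.sign(grad_output_wrt_input(x))
        attr += grad_output_wrt_middle(x) * (middle(x_next) - middle(x))
        x = x_next
    return attr, x

x0 = rng.normal(size=6)
attr, x_end = path_attribution(x0)
# Neurons with large |attr| contributed most to the output change on this path.
ranking = np.argsort(-np.abs(attr))
```

Because the toy head is linear, the per-neuron attributions sum exactly to the change in output between the start and end of the path, which gives a quick sanity check. A feature-level attack would then use such per-neuron weights to decide which middle-layer features to suppress or amplify.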

Code Implementations: 1 repo
