CV CR LG · Nov 24, 2023

Trainwreck: A damaging adversarial attack on image classifiers

arXiv:2311.14772v2 · 2.81 citations · h-index: 12 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses security concerns for computer vision models in applied practice. It is a novel but incremental advance in adversarial attacks.

The paper tackles the problem of damaging adversarial attacks on image classifiers by proposing Trainwreck, a train-time attack that poisons training data to degrade model performance, achieving similar or better potency than state-of-the-art methods on datasets like CIFAR-10 and CIFAR-100.

Adversarial attacks are an important security concern for computer vision (CV). As CV models are becoming increasingly valuable assets in applied practice, disrupting them is emerging as a form of economic sabotage. This paper opens up the exploration of damaging adversarial attacks (DAAs) that seek to damage target CV models. DAAs are formalized by defining the threat model, the cost function DAAs maximize, and setting three requirements for success: potency, stealth, and customizability. As a pioneer DAA, this paper proposes Trainwreck, a train-time attack that conflates the data of similar classes in the training data using stealthy ($\epsilon \leq 8/255$) class-pair universal perturbations obtained from a surrogate model. Trainwreck is a black-box, transferable attack: it requires no knowledge of the target architecture, and a single poisoned dataset degrades the performance of any model trained on it. The experimental evaluation on CIFAR-10 and CIFAR-100 and various model architectures (EfficientNetV2, ResNeXt-101, and a finetuned ViT-L-16) demonstrates Trainwreck's efficiency. Trainwreck achieves similar or better potency compared to the data poisoning state of the art and is fully customizable by the poison rate parameter. Finally, data redundancy with hashing is identified as a reliable defense against Trainwreck or similar DAAs. The code is available at https://github.com/JanZahalka/trainwreck.
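
The defense named at the end of the abstract, data redundancy with hashing, can be illustrated with a minimal sketch. The abstract does not spell out the paper's exact procedure, so everything below is an assumption made for illustration: the helper names `fingerprint` and `find_tampered` are hypothetical, SHA-256 is an arbitrary choice of hash, and short byte strings stand in for image files. The idea shown is that hashes of a trusted copy, stored where an attacker cannot write, reveal any image that was later modified, even by an imperceptibly small perturbation.

```python
import hashlib


def fingerprint(images: dict[str, bytes]) -> dict[str, str]:
    """Map each image name to the SHA-256 hex digest of its raw bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in images.items()}


def find_tampered(trusted_hashes: dict[str, str], images: dict[str, bytes]) -> list[str]:
    """Return sorted names of images that are missing, new, or changed
    relative to the trusted hashes (hypothetical helper, not the paper's code)."""
    current = fingerprint(images)
    names = set(trusted_hashes) | set(current)
    return sorted(n for n in names if trusted_hashes.get(n) != current.get(n))


# Toy data: raw bytes standing in for image files (assumption for illustration).
clean = {"cat_001.png": b"\x10\x20\x30", "dog_001.png": b"\x40\x50\x60"}
trusted = fingerprint(clean)  # assumed to be stored out of the attacker's reach

# Simulate a poisoning step that shifts one byte of one image by a single value.
poisoned = dict(clean)
poisoned["cat_001.png"] = b"\x10\x21\x30"

print(find_tampered(trusted, clean))     # prints []
print(find_tampered(trusted, poisoned))  # prints ['cat_001.png']
```

Any image flagged this way would then be restored from the redundant clean copy before training. A cryptographic hash is used here because even a one-bit change to the file alters the digest, so the small perturbation budget that makes the poison visually stealthy does not help it evade the check.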
