CV | Dec 25, 2024

Distortion-Aware Adversarial Attacks on Bounding Boxes of Object Detectors

Pham Phuc, Son Vuong, Khang Nguyen, Tuan Dang

arXiv:2412.18815v13.72 citations | h-index: 3 | Has Code | VISIGRAPP : VISAPP

Originality: Highly original

AI Analysis

This work addresses the vulnerability of object detectors to adversarial examples, which is a critical issue for real-world applications like autonomous driving and surveillance, though it is incremental in advancing attack techniques.

The paper tackles the problem of adversarial attacks on object detectors by proposing a novel method that perturbs object confidence scores during training, achieving success rates up to 100% in white-box and 98% in black-box attacks on state-of-the-art models like YOLOv8 and Faster R-CNN.

Deep learning-based object detection has become ubiquitous in the last decade due to its high accuracy in many real-world applications. As this trend grows, these models have become attractive targets for adversaries, yet most existing results concern classifiers, which do not match the setting of practical object detection. In this work, we propose a novel method to fool object detectors, expose the vulnerability of state-of-the-art detectors, and encourage future work on detectors that are more robust to adversarial examples. Our method generates adversarial images by perturbing object confidence scores during training, which are crucial for predicting the confidence of each class in the testing phase. We provide a more intuitive technique that embeds additive noise based on detected objects' masks and the training loss, with distortion control over the original image achieved by leveraging the gradients of the iteratively updated images. To verify the proposed method, we perform adversarial attacks against different object detectors, including recent state-of-the-art models such as YOLOv8, Faster R-CNN, RetinaNet, and Swin Transformer. We also evaluate our technique on the MS COCO 2017 and PASCAL VOC 2012 datasets and analyze the trade-off between attack success rate and image distortion. Our experiments show that the achievable attack success rate is up to 100% for white-box attacks and up to 98% for black-box attacks. The source code and relevant documentation for this work are available at the following link: https://github.com/anonymous20210106/attack_detector
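To make the idea in the abstract concrete, here is a minimal sketch of an iterative, mask-restricted, distortion-bounded perturbation. It is not the authors' implementation: the detector is replaced by a hypothetical toy scorer (`toy_confidence`, a sigmoid over a linear score), the distortion control is assumed to be an L-infinity budget (`max_distortion`), and all function names, parameters, and the random data are illustrative assumptions. The real method and its distortion measure are described in the paper and the linked repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_confidence(image, weights, bias):
    # Hypothetical stand-in for a detector's object confidence score.
    return sigmoid(np.sum(weights * image) + bias)

def masked_attack(image, mask, weights, bias, step=0.01,
                  max_distortion=0.1, conf_threshold=0.5, max_iters=200):
    """Lower the toy confidence by perturbing only the masked pixels."""
    adv = image.copy()
    for _ in range(max_iters):
        conf = toy_confidence(adv, weights, bias)
        if conf < conf_threshold:
            break  # the toy "detection" is suppressed
        # Gradient of the confidence with respect to the pixels.
        grad = conf * (1.0 - conf) * weights
        # Signed gradient step that decreases confidence, inside the mask only.
        adv = adv - step * np.sign(grad) * mask
        # Assumed distortion control: bound the perturbation (L-infinity)
        # and keep pixels in the valid [0, 1] range.
        delta = np.clip(adv - image, -max_distortion, max_distortion)
        adv = np.clip(image + delta, 0.0, 1.0)
    return adv, toy_confidence(adv, weights, bias)

# Illustrative data: an 8x8 "image" with a 4x4 object mask in the centre.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8))
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
weights = rng.normal(size=(8, 8))
bias = 2.0 - np.sum(weights * image)  # start from a confident toy detection

initial_conf = toy_confidence(image, weights, bias)
adv_image, final_conf = masked_attack(image, mask, weights, bias)
distortion = np.max(np.abs(adv_image - image))
print(f"confidence: {initial_conf:.3f} -> {final_conf:.3f}, "
      f"max pixel change: {distortion:.3f}")
```

Raising `max_distortion` in this sketch gives the attack more room to lower the confidence at the cost of a more visible change, which mirrors in a simplified way the trade-off between attack success rate and image distortion that the abstract mentions.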
