CRCVLGOct 25, 2020

Dynamic Adversarial Patch for Evading Object Detection Models

arXiv:2010.13070v152 citations
Originality Incremental advance
AI Analysis

This addresses security vulnerabilities in computer vision systems for applications like autonomous vehicles, though it is incremental over existing patch-based attacks.

The paper tackles the problem of real-world adversarial attacks on object detectors by introducing dynamic adversarial patches that adjust to camera positions, achieving up to 90% evasion on YOLOv2 and 71% transferability across car models.

Recent research shows that neural networks models used for computer vision (e.g., YOLO and Fast R-CNN) are vulnerable to adversarial evasion attacks. Most of the existing real-world adversarial attacks against object detectors use an adversarial patch which is attached to the target object (e.g., a carefully crafted sticker placed on a stop sign). This method may not be robust to changes in the camera's location relative to the target object; in addition, it may not work well when applied to nonplanar objects such as cars. In this study, we present an innovative attack method against object detectors applied in a real-world setup that addresses some of the limitations of existing attacks. Our method uses dynamic adversarial patches which are placed at multiple predetermined locations on a target object. An adversarial learning algorithm is applied in order to generate the patches used. The dynamic attack is implemented by switching between optimized patches dynamically, according to the camera's position (i.e., the object detection system's position). In order to demonstrate our attack in a real-world setup, we implemented the patches by attaching flat screens to the target object; the screens are used to present the patches and switch between them, depending on the current camera location. Thus, the attack is dynamic and adjusts itself to the situation to achieve optimal results. We evaluated our dynamic patch approach by attacking the YOLOv2 object detector with a car as the target object and succeeded in misleading it in up to 90% of the video frames when filming the car from a wide viewing angle range. We improved the attack by generating patches that consider the semantic distance between the target object and its classification. We also examined the attack's transferability among different car models and were able to mislead the detector 71% of the time.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes