Bait and Switch: Online Training Data Poisoning of Autonomous Driving Systems
This work addresses a security vulnerability in autonomous vehicles, but its contribution is incremental, as it builds on existing data poisoning concepts and applies them to a specific domain.
The paper tackles the problem of online training data poisoning in autonomous driving systems by showing that an adversary can inject subtle environmental perturbations to degrade a DNN's performance, specifically reducing a traffic light classifier's accuracy during operation.
We show that by controlling parts of a physical environment in which a pre-trained deep neural network (DNN) is being fine-tuned online, an adversary can launch subtle data poisoning attacks that degrade the performance of the system. While the attack can be applied in general to any perception task, we consider a DNN-based traffic light classifier for an autonomous car that has been trained in one city and is being fine-tuned online in another city. We show that by injecting environmental perturbations that do not modify the traffic lights themselves or the ground-truth labels, the adversary can cause the deep network to learn spurious concepts during the online learning phase. The attacker can then leverage the introduced spurious concepts in the environment to degrade the model's accuracy during operation, thereby causing the system to malfunction.
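The mechanism described above can be illustrated with a toy sketch. This is not the paper's DNN or its dataset: it assumes a logistic regression over two synthetic features, a noisy "light" signal and an attacker-controlled "context" cue, and every name in it (`make_data`, `pretrain`, `finetune_online`, the context modes) is hypothetical. During online fine-tuning the context cue is perfectly correlated with the correct label, so labels and lights are never altered. At operation time the attacker flips the cue.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def make_data(n, context):
    """Synthetic samples with columns [light signal, context cue, bias].

    context = "random":   cue is independent of the label (first city)
    context = "poisoned": cue always agrees with the correct label
    context = "flipped":  cue always contradicts the correct label
    """
    y = rng.integers(0, 2, size=n)            # 1 = green, 0 = red (always correct)
    s = 2 * y - 1                             # label mapped to -1 / +1
    light = s + rng.normal(0.0, 1.0, size=n)  # noisy but genuine evidence
    if context == "random":
        cue = rng.choice([-1, 1], size=n)
    elif context == "poisoned":
        cue = s
    else:
        cue = -s
    return np.column_stack([light, cue, np.ones(n)]), y

def pretrain(X, y, steps=500, lr=0.5):
    """Offline pre-training with full-batch gradient ascent on the log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w += lr * X.T @ (y - sigmoid(X @ w)) / len(y)
    return w

def finetune_online(w, X, y, lr=0.1):
    """Online fine-tuning: one SGD update per incoming sample."""
    w = w.copy()
    for x_i, y_i in zip(X, y):
        w += lr * (y_i - sigmoid(x_i @ w)) * x_i
    return w

def accuracy(w, X, y):
    return float(np.mean((X @ w > 0) == (y == 1)))

# Pre-train in the first city, where the cue carries no information.
X_a, y_a = make_data(2000, "random")
w_pre = pretrain(X_a, y_a)

# Fine-tune online in the second city, where the attacker controls the cue.
X_b, y_b = make_data(2000, "poisoned")
w_ft = finetune_online(w_pre, X_b, y_b)

# Operation: the attacker presents the cue alongside the opposite light.
X_atk, y_atk = make_data(2000, "flipped")
acc_pre = accuracy(w_pre, X_atk, y_atk)
acc_ft = accuracy(w_ft, X_atk, y_atk)

print(f"cue weight before / after fine-tuning: {w_pre[1]:.2f} / {w_ft[1]:.2f}")
print(f"accuracy under attack, pre-trained model: {acc_pre:.2f}")
print(f"accuracy under attack, fine-tuned model:  {acc_ft:.2f}")
```

In this sketch the pre-trained model largely ignores the cue, while the fine-tuned model has learned to rely on it and is misled once the cue contradicts the light. How strongly a real DNN would pick up such a cue depends on the architecture, the data, and the fine-tuning schedule, none of which are modeled here.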