LG | CR | CV | Jun 12, 2022

Consistent Attack: Universal Adversarial Perturbation on Embodied Vision Navigation

Tsinghua
arXiv:2206.05751v4 | 14 citations | h-index: 51
AI Analysis

This work highlights serious security risks for real-world embodied AI systems, though it is incremental as it extends UAP to sequential decision settings.

The authors tackled the vulnerability of embodied vision navigation agents to universal adversarial perturbations (UAP) by formulating the problem as a δ-disturbed Markov Decision Process and proposing two Consistent Attack methods, which caused a significant performance drop in victim models on the PointGoal task in Habitat.

Embodied agents in vision navigation coupled with deep neural networks have attracted increasing attention. However, deep neural networks have been shown to be vulnerable to malicious adversarial noise, which may cause catastrophic failures in Embodied Vision Navigation. Among different kinds of adversarial noise, universal adversarial perturbations (UAP), i.e., a constant image-agnostic perturbation applied to every input frame of the agent, play a critical role in Embodied Vision Navigation since they are computationally efficient and practical to apply during the attack. However, existing UAP methods ignore the system dynamics of Embodied Vision Navigation and might be sub-optimal. In order to extend UAP to the sequential decision setting, we formulate the disturbed environment under the universal noise $δ$ as a $δ$-disturbed Markov Decision Process ($δ$-MDP). Based on this formulation, we analyze the properties of the $δ$-MDP and propose two novel Consistent Attack methods, named Reward UAP and Trajectory UAP, for attacking embodied agents; both consider the dynamics of the MDP and calculate universal noise by estimating the disturbed distribution and the disturbed Q function. For various victim models, our Consistent Attack can cause a significant drop in performance on the PointGoal task in Habitat across different datasets and scenes. Extensive experimental results indicate that there are serious potential risks in applying Embodied Vision Navigation methods to the real world.
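To make the idea of a constant perturbation applied to every input frame concrete, here is a minimal sketch of a disturbed environment in the spirit of the $δ$-MDP described above. It is not the authors' implementation and does not show how Reward UAP or Trajectory UAP compute $δ$. The names `perturb`, `ToyEnv`, and `DisturbedEnv`, the L-infinity budget `epsilon`, and the [0, 1] pixel range are all assumptions made for illustration; `ToyEnv` is a stand-in, not Habitat's API.

```python
import numpy as np


def perturb(frame, delta, epsilon):
    """Apply one fixed perturbation to a frame with pixel values in [0, 1]."""
    bounded = np.clip(delta, -epsilon, epsilon)  # assumed L-infinity budget
    return np.clip(frame + bounded, 0.0, 1.0)    # keep a valid image


class ToyEnv:
    """Hypothetical stand-in for a navigation simulator (not Habitat's API)."""

    def __init__(self, shape=(4, 4, 3), horizon=5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.shape, self.horizon, self.t = shape, horizon, 0

    def reset(self):
        self.t = 0
        return self.rng.random(self.shape)

    def step(self, action):
        self.t += 1
        obs = self.rng.random(self.shape)
        return obs, 0.0, self.t >= self.horizon


class DisturbedEnv:
    """Wraps an environment so the agent only ever sees perturbed frames."""

    def __init__(self, env, delta, epsilon):
        self.env, self.delta, self.epsilon = env, delta, epsilon

    def reset(self):
        return perturb(self.env.reset(), self.delta, self.epsilon)

    def step(self, action):
        obs, reward, done = self.env.step(action)
        # The same delta is reused at every step: image-agnostic by design.
        return perturb(obs, self.delta, self.epsilon), reward, done
```

Because the wrapper changes only what the agent observes, the underlying transitions and rewards are untouched. The perturbation still influences the trajectory indirectly, through the actions the agent picks from the disturbed frames.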

Code Implementations: 1 repo
