LG, CR, CV, ML | Jul 5, 2019

Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions

arXiv:1907.02957v2 | 77 citations
Originality: Incremental advance
AI Analysis

This work addresses the issue of adversarial vulnerabilities in AI systems, offering insights for improving robustness, though it is incremental as it builds on existing CapsNet and adversarial detection methods.

The paper tackles the problem of detecting adversarial examples in neural networks by using class-conditional reconstructions. An attack designed to evade this detector can still produce undetected adversarial examples, but with a much smaller success rate, and CapsNets outperform convolutional networks across the attacks considered. The paper also shows that the success of reconstructive attacks on CapsNets is linked to visual similarity between the source and target classes, suggesting these networks use features more aligned with human perception.

Adversarial examples raise questions about whether neural network models are sensitive to the same visual features as humans. In this paper, we first detect adversarial examples or otherwise corrupted images based on a class-conditional reconstruction of the input. To specifically attack our detection mechanism, we propose the Reconstructive Attack, which seeks both to cause a misclassification and to keep the reconstruction error low. This reconstructive attack produces undetected adversarial examples, but with a much smaller success rate. Across all these attacks, we find that CapsNets always perform better than convolutional networks. Then, we diagnose the adversarial examples for CapsNets and find that the success of the reconstructive attack is highly related to the visual similarity between the source and target class. Additionally, the resulting perturbations can cause the input image to appear visually more like the target class and hence become non-adversarial. This suggests that CapsNets use features that are more aligned with human perception and have the potential to address the central issue raised by adversarial examples.
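
The detection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the percentile-based threshold, the weighting parameter `alpha`, and the toy nearest-prototype "model" are all assumptions made for illustration. The abstract states only that detection relies on a class-conditional reconstruction of the input, and that the Reconstructive Attack seeks both a misclassification and a low reconstruction error.

```python
import numpy as np


def reconstruction_error(x, x_recon):
    """Euclidean distance between an input and its reconstruction (assumed metric)."""
    return float(np.linalg.norm(np.ravel(x) - np.ravel(x_recon)))


def pick_threshold(clean_errors, percentile=95.0):
    """Assumed rule: take a high percentile of the errors seen on clean validation data."""
    return float(np.percentile(clean_errors, percentile))


def is_flagged(x, predict, reconstruct, threshold):
    """Reconstruct x conditioned on the predicted class; flag it if the error is large."""
    label = predict(x)
    x_recon = reconstruct(x, label)
    return reconstruction_error(x, x_recon) > threshold


def reconstructive_attack_objective(class_loss, recon_error, alpha=1.0):
    """Hypothetical combined objective: an attacker minimises the classification
    loss towards the target class plus a weighted reconstruction error."""
    return class_loss + alpha * recon_error


# Toy stand-in for a trained model (hypothetical): each class has one prototype,
# the classifier picks the nearest prototype, and the class-conditional
# "reconstruction" is simply the prototype of the given class.
prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])


def toy_predict(x):
    return int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))


def toy_reconstruct(x, label):
    return prototypes[label]


threshold = pick_threshold([0.1, 0.2, 0.3, 0.4, 0.5])
clean_flag = is_flagged(np.array([0.1, 0.1]), toy_predict, toy_reconstruct, threshold)
odd_flag = is_flagged(np.array([5.0, 4.0]), toy_predict, toy_reconstruct, threshold)
```

In this toy setting, an input close to a class prototype reconstructs well and passes, while an input far from every prototype reconstructs poorly and is flagged. An attacker aware of the detector would have to keep the second term of the combined objective small as well, which is consistent with the lower success rate the abstract reports for the Reconstructive Attack.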

