ML · CR · LG · May 15, 2019

Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

arXiv:1905.05897v2 · 324 citations
Originality Highly original
AI Analysis

This paper addresses security vulnerabilities in machine learning systems by enabling transferable poisoning attacks that need no access to the victim model's details, though its contribution is incremental in that it improves on existing attack methods.

The paper tackles the problem of clean-label poisoning attacks on deep neural networks by proposing a polytope attack that surrounds the target image in feature space, achieving over 50% success rates with only 1% poisoning of the training set.

Clean-label poisoning attacks inject innocuous-looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data. We consider transferable poisoning attacks that succeed without access to the victim network's outputs, architecture, or (in some cases) training data. To achieve this, we propose a new "polytope attack" in which poison images are designed to surround the targeted image in feature space. We also demonstrate that using Dropout during poison creation helps to enhance transferability of this attack. We achieve transferable attack success rates of over 50% while poisoning only 1% of the training set.
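To make the phrase "surround the targeted image in feature space" concrete, the sketch below checks one possible reading of it: the target's feature vector lies inside (or close to) the convex hull of the poison feature vectors. It is an illustration under stated assumptions, not the paper's implementation. The feature vectors are plain lists of floats standing in for a network's penultimate-layer outputs, the function names `project_to_simplex` and `hull_residual` are hypothetical, and normalizing the residual by the target's squared norm is an illustrative choice. The sketch only measures the geometric condition for fixed features; it does not optimize poison images.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {c : c_j >= 0, sum(c) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # the last k that passes this test gives the threshold
    return [max(x - theta, 0.0) for x in v]


def hull_residual(target, poisons, steps=500):
    """Find convex weights c minimizing ||sum_j c_j * poisons[j] - target||^2.

    Returns (c, relative_residual), where relative_residual is the squared
    distance from the target to its nearest convex combination of the
    poisons, divided by ||target||^2 (assumed normalization).
    A value near 0 means the poisons surround the target.
    """
    k, d = len(poisons), len(target)
    # Safe step size: the gradient's Lipschitz constant is at most
    # 2 * (sum of squared poison entries).
    sq = sum(x * x for p in poisons for x in p) or 1.0
    lr = 1.0 / (2.0 * sq)
    c = [1.0 / k] * k  # start from uniform weights

    def residual(weights):
        return [sum(weights[j] * poisons[j][i] for j in range(k)) - target[i]
                for i in range(d)]

    for _ in range(steps):
        r = residual(c)
        grad = [2.0 * sum(r[i] * poisons[j][i] for i in range(d))
                for j in range(k)]
        # Projected gradient step keeps c on the probability simplex.
        c = project_to_simplex([c[j] - lr * grad[j] for j in range(k)])

    r = residual(c)
    target_sq = sum(x * x for x in target) or 1.0
    return c, sum(x * x for x in r) / target_sq


if __name__ == "__main__":
    # Toy 2-D "features": three poisons forming a triangle.
    poisons = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
    print(hull_residual([1.0, 1.0], poisons))  # target inside the triangle
    print(hull_residual([5.0, 5.0], poisons))  # target outside the triangle
```

Under this reading, a target inside the hull has a residual near zero, so a linear classifier that assigns every poison to one class must assign the target to that class too. A target outside the hull leaves a positive residual, which an attack of this kind would try to drive down by adjusting the poison images.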

Code Implementations · 1 repo
