CR, LG (Aug 11, 2020)

Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks

arXiv:2008.04495v7, 157 citations, Has Code
AI Analysis

The paper gives practitioners in security-critical domains a theoretical guarantee against data poisoning, though the contribution is incremental in that it builds on the well-known bagging method.

The paper tackles the problem of data poisoning attacks on machine learning models by proving that Bootstrap Aggregating (bagging) has intrinsic certified robustness, achieving a certified accuracy of 91.1% on MNIST when up to 100 training examples are arbitrarily modified, deleted, or inserted.

In a \emph{data poisoning attack}, an attacker modifies, deletes, and/or inserts some training examples to corrupt the learnt machine learning model. \emph{Bootstrap Aggregating (bagging)} is a well-known ensemble learning method, which trains multiple base models on random subsamples of a training dataset using a base learning algorithm and uses majority vote to predict labels of testing examples. We prove the intrinsic certified robustness of bagging against data poisoning attacks. Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold. Moreover, we show that our derived threshold is tight if no assumptions on the base learning algorithm are made. We evaluate our method on MNIST and CIFAR10. For instance, our method achieves a certified accuracy of $91.1\%$ on MNIST when arbitrarily modifying, deleting, and/or inserting 100 training examples. Code is available at: \url{https://github.com/jjy1994/BaggingCertifyDataPoisoning}.
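The bagging procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy 1-D dataset, the nearest-neighbour base learner, and the tie-breaking rule are all assumptions made for the example. The helper `clean_subsample_probability` only illustrates the intuition behind the robustness result (a small subsample is unlikely to contain any modified example); it covers modification only, where the training set size stays fixed, and it is not the certified threshold derived in the paper.

```python
import random
from collections import Counter


def train_bagging(train_set, base_learner, num_models, subsample_size, seed=0):
    """Train num_models base models, each on a random subsample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(num_models):
        subsample = [rng.choice(train_set) for _ in range(subsample_size)]
        models.append(base_learner(subsample))
    return models


def predict_majority(models, x):
    """Return the label with the most votes; ties go to the smallest label (an assumption)."""
    votes = Counter(model(x) for model in models)
    top = max(votes.values())
    return min(label for label, count in votes.items() if count == top)


def clean_subsample_probability(n, k, r):
    """Probability that k draws with replacement from n examples miss all r modified ones."""
    return (1 - r / n) ** k


def nearest_neighbour_learner(subsample):
    """Hypothetical base learner: 1-nearest-neighbour on (feature, label) pairs."""
    def model(x):
        return min(subsample, key=lambda example: abs(example[0] - x))[1]
    return model


# Toy 1-D dataset (illustrative only): the label is 1 when the feature is at least 0.5.
train_set = [(i / 100, int(i >= 50)) for i in range(100)]
models = train_bagging(train_set, nearest_neighbour_learner, num_models=101, subsample_size=5)
print(predict_majority(models, 0.1), predict_majority(models, 0.9))
print(clean_subsample_probability(n=100, k=5, r=10))
```

In this sketch, a smaller `subsample_size` makes each base model less likely to see a modified example but also gives it less data to learn from, which is the kind of trade-off a practitioner would need to tune.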

Code Implementations: 1 repo