LG CR ML

May 28, 2019

An Investigation of Data Poisoning Defenses for Online Learning

Yizhen Wang, Somesh Jha, Kamalika Chaudhuri

arXiv:1905.12121v35.45 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses security threats to machine learning systems in applications vulnerable to data poisoning. It is incremental: it builds on prior attacks and defenses without introducing new methods.

The paper investigates defenses against data poisoning attacks in online learning, analyzing conditions under which four standard defenses resist or allow rapid poisoning and showing that adversary success depends on the learning problem's difficulty in a realistic threat model.

Data poisoning attacks, where an adversary can modify a small fraction of training data with the goal of forcing the trained classifier to incur high loss, are an important threat for machine learning in many applications. While a body of prior work has developed attacks and defenses, there is little general understanding of when various attacks and defenses are effective. In this work, we undertake a rigorous study of defenses against data poisoning for online learning. First, we study four standard defenses in a powerful threat model, and provide conditions under which they can allow or resist rapid poisoning. We then consider a weaker and more realistic threat model, and show that, in this setting, the success of the adversary in the presence of data poisoning defenses depends on the "ease" of the learning problem.
