LG ML Sep 15, 2020

Data Poisoning Attacks on Regression Learning and Corresponding Defenses

Nicolas Michael Müller, Daniel Kowatsch, Konstantin Böttinger

arXiv:2009.07008v15.827 citations, h-index: 12, Has Code

Originality: Incremental advance

AI Analysis

This work addresses security threats to mission-critical regression systems such as medical dosing and cyber-physical control, though it is an incremental advance that extends poisoning studies from classification to regression.

The paper tackles data poisoning attacks on regression learning, showing that inserting just 2% of poison samples increases the mean squared error to 150%, and proposes a defense strategy that effectively mitigates these attacks across 26 datasets.

Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset. So far, it has been studied mostly for classification, even though regression learning is used in many mission-critical systems (such as dosage of medication, control of cyber-physical systems, and managing power supply). Therefore, in the present research, we aim to evaluate all aspects of data poisoning attacks on regression learning, exceeding previous work both in terms of breadth and depth. We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack, which is then applied to a real-world medical use case. As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent due to inserting only two percent of poison samples. Finally, we present a new defense strategy against the novel and previous attacks and evaluate it thoroughly on 26 datasets. As a result of the conducted experiments, we conclude that the proposed defense strategy effectively mitigates the considered attacks.
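To make the threat model concrete, the following toy sketch shows how a small fraction of poisoned training points can raise a regressor's test MSE. It is not the paper's black-box attack or its defense: the synthetic data, the one-dimensional least-squares model, and the fixed poison point at (x = 1, y = 0) are all assumptions chosen for illustration, and the helper names (`fit_line`, `mse`, `make_data`) are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(rng, n):
    """Synthetic clean data (assumption): y = 0.25 + 0.5*x + Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [0.25 + 0.5 * x + rng.gauss(0.0, 0.05) for x in xs]
    return xs, ys

rng = random.Random(0)
x_train, y_train = make_data(rng, 500)
x_test, y_test = make_data(rng, 1000)

# Insert 2% poison samples, matching the poison rate quoted in the abstract.
poison_rate = 0.02
n_poison = round(poison_rate * len(x_train))

# Naive poison (assumption): points far from the true trend but still
# inside the [0, 1] feature and target range.
x_poisoned = x_train + [1.0] * n_poison
y_poisoned = y_train + [0.0] * n_poison

clean_model = fit_line(x_train, y_train)
poisoned_model = fit_line(x_poisoned, y_poisoned)

# Both models are evaluated on the same clean test set.
mse_clean = mse(clean_model, x_test, y_test)
mse_poisoned = mse(poisoned_model, x_test, y_test)
print(f"clean MSE:    {mse_clean:.5f}")
print(f"poisoned MSE: {mse_poisoned:.5f}")
print(f"ratio:        {mse_poisoned / mse_clean:.2f}")
```

The ratio printed here depends entirely on the toy setup and should not be read as a reproduction of the 150 percent figure reported in the abstract.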
