Optimal Feature Manipulation Attacks Against Linear Regression
This work addresses security vulnerabilities of linear regression as used in machine learning systems, but its contribution is incremental because it builds on existing poisoning attack methods.
The paper tackles the problem of manipulating linear regression coefficients through data poisoning, providing closed-form solutions for single-coefficient attacks and semidefinite relaxation for targeted attacks, with numerical examples illustrating the results.
In this paper, we investigate how to manipulate the coefficients obtained via linear regression by adding carefully designed poisoning data points to the dataset or by modifying the original data points. Given an energy budget, we first provide the closed-form solution for the optimal poisoning data point when the goal is to modify one designated regression coefficient. We then extend the analysis to the more challenging scenario in which the attacker aims to change one particular regression coefficient while changing the others as little as possible. For this scenario, we introduce a semidefinite relaxation method to design the best attack scheme. Finally, we study a more powerful adversary who can perform a rank-one modification of the feature matrix. We propose an alternating optimization method to find the optimal rank-one modification matrix. Numerical examples are provided to illustrate the analytical results obtained in this paper.
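To make the first attack setting concrete, the following is a minimal sketch of how a single appended data point can shift one regression coefficient. It does not implement the closed-form solution described above; it substitutes a naive random search over candidate points. The function names (`ols`, `poisoned_coef`, `random_search_attack`) are hypothetical, and the energy budget is assumed here to be a fixed Euclidean norm on the concatenated point (x0, y0), which may differ from the constraint used in the paper.

```python
import numpy as np

def ols(X, y):
    """Ordinary least-squares coefficients for y ~ X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def poisoned_coef(X, y, x0, y0):
    """Refit the regression after appending one adversarial row (x0, y0)."""
    return ols(np.vstack([X, x0]), np.append(y, y0))

def random_search_attack(X, y, target_idx, budget, n_trials=2000, seed=0):
    """Naive random search (NOT the paper's closed form).

    Samples candidate points with ||(x0, y0)||_2 == budget (an assumed
    form of the energy constraint) and keeps the one that moves the
    coefficient at target_idx the most.
    """
    rng = np.random.default_rng(seed)
    base = ols(X, y)
    best_point, best_shift = None, 0.0
    for _ in range(n_trials):
        z = rng.standard_normal(X.shape[1] + 1)
        z *= budget / np.linalg.norm(z)      # project onto the budget sphere
        x0, y0 = z[:-1], z[-1]
        shift = abs(poisoned_coef(X, y, x0, y0)[target_idx] - base[target_idx])
        if shift > best_shift:
            best_point, best_shift = (x0, y0), shift
    return best_point, best_shift
```

Calling `random_search_attack(X, y, target_idx=0, budget=5.0)` on a feature matrix `X` and response vector `y` returns the best candidate point found and the absolute change it induces in the first coefficient. A random search of this kind only gives a lower bound on what an optimal attacker could achieve under the same budget.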