LG · Nov 16, 2025

On Robustness of Linear Classifiers to Targeted Data Poisoning

Nakshatra Gupta, Sumanth Prabhu, Supratik Chakraborty, R Venkatesh

arXiv:2511.12722v14.1

Originality: Incremental advance

AI Analysis

This work addresses the security of linear classifiers in adversarial settings. It is an incremental advance because it builds on existing poisoning attack models.

The paper addresses how to measure a dataset's robustness against targeted data poisoning attacks, in which an adversary manipulates training labels to alter the classification of a targeted test point. It presents a technique that efficiently computes lower and upper bounds on robustness. Experiments show that poisoning beyond these bounds significantly affects classification, and that the technique works in cases where state-of-the-art methods fail.

Data poisoning is a training-time attack that undermines the trustworthiness of learned models. In a targeted data poisoning attack, an adversary manipulates the training dataset to alter the classification of a targeted test point. Given the typically large size of training datasets, manual detection of poisoning is difficult. An alternative is to automatically measure a dataset's robustness against such an attack, which is the focus of this paper. We consider a threat model wherein an adversary can only perturb the labels of the training dataset, with knowledge limited to the hypothesis space of the victim's model. In this setting, we prove that finding the robustness is an NP-Complete problem, even when hypotheses are linear classifiers. To overcome this, we present a technique that finds lower and upper bounds on the robustness. Our implementation of the technique computes these bounds efficiently in practice for many publicly available datasets. We experimentally demonstrate the effectiveness of our approach. Specifically, poisoning that exceeds the identified robustness bounds significantly impacts test point classification. We are also able to compute these bounds in many more cases where state-of-the-art techniques fail.
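To make the label-flipping threat model concrete, here is a minimal toy sketch in Python. It is not the paper's bounding technique. It assumes robustness is measured by the smallest number of training-label flips that changes the prediction on the target point. It also fixes one particular learner (ridge-regularised least squares), which departs from the paper's setting where the adversary knows only the hypothesis space. The names `fit_linear`, `predict`, and `min_label_flips` are hypothetical. The search is exhaustive, so it is only usable on tiny datasets, which is in keeping with the hardness result stated above.

```python
from itertools import combinations

import numpy as np


def fit_linear(X, y, ridge=1e-3):
    """Fit a linear classifier by ridge-regularised least squares (assumed learner)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, x):
    """Return +1 or -1 for a single point x under weights w (last entry is the bias)."""
    return 1 if np.dot(w[:-1], x) + w[-1] >= 0 else -1


def min_label_flips(X, y, x_target, max_flips=None):
    """Exhaustively search for the fewest label flips that change the prediction on x_target.

    Returns (k, flipped_indices), or None if no such set exists within max_flips.
    """
    clean = predict(fit_linear(X, y), x_target)
    n = len(y)
    limit = n if max_flips is None else max_flips
    for k in range(1, limit + 1):
        for idx in combinations(range(n), k):
            y_poisoned = y.copy()
            y_poisoned[list(idx)] *= -1  # flip the chosen labels
            if predict(fit_linear(X, y_poisoned), x_target) != clean:
                return k, idx
    return None


# Tiny 1-D dataset with labels in {-1, +1} (illustrative data, not from the paper).
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

near = min_label_flips(X, y, np.array([0.5]))   # target close to the decision boundary
far = min_label_flips(X, y, np.array([10.0]))   # target far from the decision boundary
print(near, far)
```

In this toy setting, any successful set of k flips shows that the minimum is at most k, so a cheaper heuristic attack would yield an upper bound of the same kind. Certifying a lower bound requires ruling out all smaller flip sets, which the exhaustive loop does here at exponential cost.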
