LG · AI · CR · CY · Oct 17, 2021

Poisoning Attacks on Fair Machine Learning

arXiv:2110.08932v1 · 31 citations
Originality: Incremental advance
AI Analysis

This paper addresses a security vulnerability in fair ML systems. The contribution is incremental: it extends adversarial poisoning attacks to fairness-aware models.

The paper tackles the problem of attacking fair machine learning models by generating poisoning samples that reduce both model accuracy and algorithmic fairness, and reports effective attacks on two real datasets.

Both fair machine learning and adversarial learning have been extensively studied. However, attacking fair machine learning models has received less attention. In this paper, we present a framework that seeks to effectively generate poisoning samples to attack both model accuracy and algorithmic fairness. Our attacking framework can target fair machine learning models trained with a variety of group-based fairness notions such as demographic parity and equalized odds. We develop three online attacks: adversarial sampling, adversarial labeling, and adversarial feature modification. All three attacks effectively and efficiently produce poisoning samples by sampling, labeling, or modifying a fraction of the training data in order to reduce the test accuracy. Our framework enables attackers to flexibly adjust the attack's focus on prediction accuracy or fairness and to accurately quantify the impact of each candidate point on both accuracy loss and fairness violation, thus producing effective poisoning samples. Experiments on two real datasets demonstrate the effectiveness and efficiency of our framework.
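
To make the two fairness notions named in the abstract concrete, here is a minimal Python sketch of how the demographic parity gap and the equalized odds gap can be measured for a binary classifier with a binary sensitive attribute. It is not the authors' implementation. The function names, the toy data, and the weighted `candidate_score` with its `lam` parameter are all hypothetical; the score is only one plausible way to read "adjust the attack's focus on prediction accuracy or fairness".

```python
def positive_rate(preds, mask):
    """Fraction of positive predictions among the rows selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0


def demographic_parity_gap(preds, groups):
    """|P(yhat=1 | g=0) - P(yhat=1 | g=1)| for a binary sensitive attribute."""
    rate0 = positive_rate(preds, [g == 0 for g in groups])
    rate1 = positive_rate(preds, [g == 1 for g in groups])
    return abs(rate0 - rate1)


def equalized_odds_gap(preds, labels, groups):
    """Largest between-group gap in positive rate, taken over true label y.

    y = 1 gives the true-positive-rate gap, y = 0 the false-positive-rate gap.
    """
    gaps = []
    for y in (0, 1):
        mask0 = [g == 0 and t == y for g, t in zip(groups, labels)]
        mask1 = [g == 1 and t == y for g, t in zip(groups, labels)]
        gaps.append(abs(positive_rate(preds, mask0) - positive_rate(preds, mask1)))
    return max(gaps)


def candidate_score(accuracy_loss, fairness_violation, lam=0.5):
    """Hypothetical attacker objective (an assumption, not from the paper).

    lam = 1 focuses purely on accuracy loss, lam = 0 purely on fairness violation.
    """
    return lam * accuracy_loss + (1.0 - lam) * fairness_violation


# Toy data: 8 points, 4 per group (illustrative values only).
preds = [1, 1, 0, 1, 0, 0, 1, 0]
labels = [1, 0, 0, 1, 1, 0, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(preds, groups)
eo_gap = equalized_odds_gap(preds, labels, groups)
print(dp_gap, eo_gap)
```

Under this reading, an attacker would rank candidate poisoning points by a score of this kind and keep the highest-scoring ones; how the paper actually quantifies each point's impact is not stated in the abstract.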

