Fairness Degrading Adversarial Attacks Against Clustering Algorithms
This work identifies a new vulnerability in fair clustering algorithms, which matter for societal resource allocation. The contribution is incremental, however, because it builds on known adversarial attack methods in machine learning.
The paper studies adversarial attacks that degrade fairness in clustering algorithms, specifically targeting k-median clustering under a whitebox threat model. It finds, through both empirical results and theoretical analysis, that adding the generated adversarial samples can significantly lower fairness values.
Clustering algorithms are ubiquitous in modern data science pipelines, and are utilized in numerous fields ranging from biology to facility location. Due to their widespread use, especially in societal resource allocation problems, recent research has aimed at making clustering algorithms fair, with great success. Furthermore, it has also been shown that clustering algorithms, much like other machine learning algorithms, are susceptible to adversarial attacks where a malicious entity seeks to subvert the performance of the learning algorithm. However, despite these known vulnerabilities, there has been no research undertaken that investigates fairness degrading adversarial attacks for clustering. We seek to bridge this gap by formulating a generalized attack optimization problem aimed at worsening the group-level fairness of centroid-based clustering algorithms. As a first step, we propose a fairness degrading attack algorithm for k-median clustering that operates under a whitebox threat model -- where the clustering algorithm, fairness notion, and the input dataset are known to the adversary. We provide empirical results as well as theoretical analysis for our simple attack algorithm, and find that the addition of the generated adversarial samples can lead to significantly lower fairness values. In this manner, we aim to motivate fairness degrading adversarial attacks as a direction for future research in fair clustering.
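To make the threat model concrete, the following toy sketch shows how injected samples can lower a group-level fairness value for a k-median-style clustering. It is not the attack algorithm proposed in the paper. The fairness notion used here (balance, the worst-case ratio between two protected groups within any cluster), the Lloyd-style L1 heuristic `k_median`, the hand-picked poison points, and all function and variable names are assumptions made only for illustration.

```python
import numpy as np

def k_median(X, init_centers, n_iter=20):
    """Lloyd-style k-median heuristic (assumed, not the paper's solver):
    assign each point to its nearest center under L1 distance, then move
    each center to the coordinate-wise median of its members."""
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(n_iter):
        # L1 distance from every point to every center, shape (n, k)
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = np.median(members, axis=0)
    dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1), centers

def balance(labels, groups, k):
    """Assumed fairness notion: the minimum over clusters of
    min(n0/n1, n1/n0), where n0 and n1 count the two protected groups.
    Returns 0.0 if any cluster is missing one of the groups."""
    worst = 1.0
    for j in range(k):
        in_cluster = groups[labels == j]
        n0 = int(np.sum(in_cluster == 0))
        n1 = int(np.sum(in_cluster == 1))
        if n0 == 0 or n1 == 0:
            return 0.0
        worst = min(worst, n0 / n1, n1 / n0)
    return worst

# Clean toy data: two well-separated clusters, each with two points per group.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
groups = np.array([0, 1, 0, 1, 0, 1, 0, 1])
init = [[0.0, 0.0], [10.0, 10.0]]

labels, _ = k_median(X, init)
clean_balance = balance(labels, groups, k=2)

# Whitebox adversary (hand-crafted here, not optimized): knowing the data,
# the algorithm, and the fairness notion, it injects four group-0 points
# inside the first cluster to skew that cluster's group proportions.
poison = np.array([[0.4, 0.4], [0.6, 0.6], [0.4, 0.6], [0.6, 0.4]])
X_adv = np.vstack([X, poison])
groups_adv = np.concatenate([groups, np.zeros(len(poison), dtype=int)])

labels_adv, _ = k_median(X_adv, init)
attacked_balance = balance(labels_adv, groups_adv, k=2)

print("balance before attack:", clean_balance)
print("balance after attack: ", round(attacked_balance, 3))
```

In this toy setting the clusters themselves do not move, yet the balance of the first cluster falls from 1.0 to 1/3 because its group composition changes from 2:2 to 6:2. An actual attack in the sense of the abstract would choose the adversarial samples by solving an optimization problem rather than by hand.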