Robust Fair Clustering: A Novel Fairness Attack and Defense Framework
This work addresses the need for fair clustering that remains fair under adversarial manipulation, a problem not previously studied, in applications such as loan approvals where manipulated outputs could harm individuals.
The authors show that fair clustering algorithms are vulnerable to adversarial attacks by introducing a novel black-box fairness attack that significantly reduces fairness performance, and they propose Consensus Fair Clustering (CFC), a defense method that is highly resilient to this attack.
Clustering algorithms are widely used in societal resource allocation applications such as loan approvals and candidate recruitment, so biased or unfair model outputs can adversely impact the individuals who rely on these applications. Many fair clustering approaches have recently been proposed to counteract this issue. Because of the potential for significant harm, it is essential that fair clustering algorithms provide consistently fair outputs even under adversarial influence. However, fair clustering algorithms have not been studied from an adversarial attack perspective. We seek to bridge this gap and conduct a robustness analysis of fair clustering by proposing a novel black-box fairness attack. Through comprehensive experiments, we find that state-of-the-art models are highly susceptible to our attack, which can reduce their fairness performance significantly. Finally, we propose Consensus Fair Clustering (CFC), the first robust fair clustering approach, which transforms consensus clustering into a fair graph partitioning problem and iteratively learns to generate fair cluster outputs. Experimentally, we observe that CFC is highly robust to the proposed attack and is thus a truly robust fair clustering alternative.
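To make two ideas from the abstract concrete, the sketch below builds a consensus graph from several base clusterings and scores a clustering for fairness. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how CFC constructs its graph or which fairness metric it reports. The sketch assumes a standard co-association matrix for the consensus step and a min/max group-count ratio ("balance") as the fairness score. The function names `co_association` and `balance` are hypothetical.

```python
import numpy as np


def co_association(labelings):
    """Consensus graph: fraction of base clusterings in which each pair of
    points lands in the same cluster (an assumed, standard construction)."""
    labelings = np.asarray(labelings)  # shape: (n_runs, n_points)
    n_runs, n_points = labelings.shape
    graph = np.zeros((n_points, n_points))
    for labels in labelings:
        # Boolean same-cluster indicator for every pair, added as 0/1.
        graph += labels[:, None] == labels[None, :]
    return graph / n_runs


def balance(labels, groups):
    """Assumed fairness score: the worst cluster's min/max ratio of
    protected-group counts. 1.0 is perfectly balanced, 0.0 means some
    cluster is missing a group entirely."""
    labels, groups = np.asarray(labels), np.asarray(groups)
    group_ids = np.unique(groups)
    scores = []
    for c in np.unique(labels):
        in_cluster = labels == c
        counts = np.array([np.sum(in_cluster & (groups == g)) for g in group_ids])
        # counts.max() > 0 because every cluster label in use has a member.
        scores.append(counts.min() / counts.max())
    return min(scores)


# Three toy base clusterings of four points.
base = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
print(co_association(base))

# The same partition scored against two different protected-group assignments.
print(balance([0, 0, 1, 1], groups=[0, 1, 0, 1]))
print(balance([0, 0, 1, 1], groups=[0, 0, 1, 1]))
```

In this toy setup, a partition that mixes both groups in every cluster scores higher than one that separates them, which is the kind of quantity a fairness attack would try to push down and a robust method would try to keep stable.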