Dec 29, 2022

Cluster-level Group Representativity Fairness in $k$-means Clustering

arXiv:2212.14467v1
h-index: 18
Originality: Incremental advance
AI Analysis

This paper addresses fairness in clustering with respect to sensitive groups such as race and gender, but the advance is incremental because it builds on existing centroid clustering methods.

The paper tackles unfairness in clustering where different groups are disadvantaged in different clusters, developing a $k$-means-based algorithm that improves fairness for the worst-off group in each cluster. On real-world datasets, it reports significant fairness improvements with low impact on cluster coherence.

There has been much interest recently in developing fair clustering algorithms that seek to do justice to the representation of groups defined along sensitive attributes such as race and gender. We observe that clustering algorithms could generate clusters such that different groups are disadvantaged within different clusters. We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms such as $k$-means, where we focus on mitigating the unfairness experienced by the most-disadvantaged group within each cluster. Our method uses an iterative optimisation paradigm whereby an initial cluster assignment is modified by reassigning objects to clusters such that the worst-off sensitive group within each cluster is benefitted. We demonstrate the effectiveness of our method through extensive empirical evaluations over a novel evaluation metric on real-world datasets. Specifically, we show that our method is effective in enhancing cluster-level group representativity fairness significantly at low impact on cluster coherence.
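
The reassignment idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not state the objective, so the sketch assumes that a group's disadvantage within a cluster is its mean squared distance to the cluster centroid, and that the worst-off group's value is added to the usual $k$-means cost with a hypothetical weight `lam`. The function names `centroids`, `objective`, and `fair_reassign` are likewise illustrative.

```python
import numpy as np

def centroids(X, labels, k):
    # Mean of each cluster; an empty cluster falls back to the global mean.
    return np.array([
        X[labels == c].mean(axis=0) if np.any(labels == c) else X.mean(axis=0)
        for c in range(k)
    ])

def objective(X, labels, groups, k, lam):
    # ASSUMPTION: coherence is the k-means sum of squared distances, and
    # unfairness is, per cluster, the largest group-wise mean squared
    # distance to the centroid (the "worst-off" group), summed over clusters.
    C = centroids(X, labels, k)
    d2 = ((X - C[labels]) ** 2).sum(axis=1)
    unfairness = 0.0
    for c in range(k):
        in_c = labels == c
        group_means = [d2[in_c & (groups == g)].mean()
                       for g in np.unique(groups[in_c])]
        if group_means:
            unfairness += max(group_means)
    return d2.sum() + lam * unfairness

def fair_reassign(X, labels, groups, k, lam=1.0, max_sweeps=10):
    # Greedy sweeps: move one object at a time to the cluster that lowers
    # the combined objective the most; stop when a sweep changes nothing.
    labels = labels.copy()
    best = objective(X, labels, groups, k, lam)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(X)):
            keep = labels[i]
            for c in range(k):
                if c == keep:
                    continue
                labels[i] = c
                candidate = objective(X, labels, groups, k, lam)
                if candidate < best - 1e-12:
                    best, keep, improved = candidate, c, True
            labels[i] = keep  # settle on the best cluster found for object i
        if not improved:
            break
    return labels, best
```

Starting from any initial assignment (for example, the output of plain $k$-means), `fair_reassign(X, labels, groups, k)` returns a new assignment whose combined objective is no higher than the initial one. Recomputing the full objective for every candidate move keeps the sketch short but is only practical for small datasets.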

