DSLGMLFeb 17, 2020

Individual Fairness for $k$-Clustering

arXiv:2002.06742v2101 citations
AI Analysis

This work addresses fairness in clustering algorithms for data science applications, offering a novel theoretical and practical approach to ensure individual fairness, though it builds on prior notions and is incremental in advancing the field.

The paper tackles the problem of achieving individual fairness in k-clustering (e.g., k-median and k-means) by developing a local search algorithm that provides a bicriteria approximation, with theoretical guarantees showing constant-factor approximations for both cost and fairness, and includes empirical validation.

We give a local search based algorithm for $k$-median and $k$-means (and more generally for any $k$-clustering with $\ell_p$ norm cost function) from the perspective of individual fairness. More precisely, for a point $x$ in a point set $P$ of size $n$, let $r(x)$ be the minimum radius such that the ball of radius $r(x)$ centered at $x$ has at least $n/k$ points from $P$. Intuitively, if a set of $k$ random points are chosen from $P$ as centers, every point $x\in P$ expects to have a center within radius $r(x)$. An individually fair clustering provides such a guarantee for every point $x\in P$. This notion of fairness was introduced in [Jung et al., 2019] where they showed how to get an approximately feasible $k$-clustering with respect to this fairness condition. In this work, we show how to get a bicriteria approximation for fair $k$-clustering: The $k$-median ($k$-means) cost of our solution is within a constant factor of the cost of an optimal fair $k$-clustering, and our solution approximately satisfies the fairness condition (also within a constant factor). Further, we complement our theoretical bounds with empirical evaluation.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes