DS CY LG | Feb 9, 2024

A Scalable Algorithm for Individually Fair K-means Clustering

MohammadHossein Bateni, Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi

arXiv:2402.06730v18.014 citations | h-index: 22 | AISTATS

Originality: Incremental advance

AI Analysis

The paper addresses the need for efficient, practical algorithms with theoretical guarantees for individually fair clustering, which matters for applications that require fairness in machine learning. The advance is incremental, since it builds on existing problem formulations.

The paper tackles the individually fair k-means clustering problem with a scalable local-search algorithm that runs in $\tilde{O}(nk^2)$ time and achieves a bicriteria $(O(1), 6)$ approximation. Empirically, the algorithm is faster than prior methods and produces lower-cost solutions.

We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al. Given $n$ points $P$ in a metric space, let $\delta(x)$ for $x\in P$ be the radius of the smallest ball around $x$ containing at least $n / k$ points. A clustering is then called individually fair if it has a center within distance $\delta(x)$ of $x$ for each $x\in P$. While good approximation algorithms are known for this problem, no efficient practical algorithms with good theoretical guarantees have been presented. We design the first fast local-search algorithm that runs in $\tilde{O}(nk^2)$ time and obtains a bicriteria $(O(1), 6)$ approximation. We then show empirically that our algorithm is not only much faster than prior work but also produces lower-cost solutions.
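The fairness condition in the abstract can be made concrete with a small brute-force check. The sketch below is illustrative only and is not the paper's local-search algorithm. It assumes Euclidean distance, assumes that the ball around $x$ counts $x$ itself, and rounds $n/k$ up to an integer. The function names `fairness_radii` and `is_individually_fair` and the relaxation parameter `gamma` are hypothetical. Here `gamma` is read as the second entry of the bicriteria pair, so `gamma=6` would test the relaxed condition that every point has a center within $6\,\delta(x)$; that reading is an assumption, not something the abstract spells out.

```python
import numpy as np


def fairness_radii(points: np.ndarray, k: int) -> np.ndarray:
    """Return delta(x) for every row x of `points`.

    delta(x) is the radius of the smallest ball centered at x that
    contains at least ceil(n / k) points (assumption: x counts as one
    of them, and distances are Euclidean).
    """
    n = len(points)
    m = int(np.ceil(n / k))
    # Brute-force O(n^2) pairwise distances; fine for a small illustration.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Row-wise sort: index 0 is the point itself (distance 0), so the
    # m-th closest point sits at index m - 1.
    return np.sort(dists, axis=1)[:, m - 1]


def is_individually_fair(points: np.ndarray, centers: np.ndarray,
                         k: int, gamma: float = 1.0) -> bool:
    """Check that every point has a center within gamma * delta(x)."""
    delta = fairness_radii(points, k)
    diffs = points[:, None, :] - centers[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)
    # Small tolerance guards against floating-point rounding.
    return bool(np.all(nearest <= gamma * delta + 1e-12))


if __name__ == "__main__":
    pts = np.array([[0.0], [1.0], [10.0], [11.0]])
    print(fairness_radii(pts, k=2))
    print(is_individually_fair(pts, np.array([[0.5], [10.5]]), k=2))
    print(is_individually_fair(pts, np.array([[0.0], [1.0]]), k=2))
```

In this toy instance, each point's nearest neighbor is at distance 1, so placing one center in each pair satisfies the condition, while placing both centers in the left pair leaves the right pair unserved.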
