LGGTApr 27, 2023

Proportionally Representative Clustering

arXiv:2304.13917v314 citationsh-index: 44
Originality Highly original
AI Analysis

This addresses fairness in unsupervised machine learning for clustering tasks, offering a novel approach that improves upon existing methods.

The paper tackles the problem of fairness in centroid clustering by proposing a new axiom called proportionally representative fairness (PRF), which ensures centroids reflect data distribution and clustering tightness, and designs efficient algorithms for unconstrained and discrete settings, achieving polynomial-time approximations for Proportional Fairness (PF).

In recent years, there has been a surge in effort to formalize notions of fairness in machine learning. We focus on centroid clustering--one of the fundamental tasks in unsupervised machine learning. We propose a new axiom ``proportionally representative fairness'' (PRF) that is designed for clustering problems where the selection of centroids reflects the distribution of data points and how tightly they are clustered together. Our fairness concept is not satisfied by existing fair clustering algorithms. We design efficient algorithms to achieve PRF both for unconstrained and discrete clustering problems. Our algorithm for the unconstrained setting is also the first known polynomial-time approximation algorithm for the well-studied Proportional Fairness (PF) axiom. Our algorithm for the discrete setting also matches the best known approximation factor for PF.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes