LG, CY, DS | Aug 22, 2022

Socially Fair Center-based and Linear Subspace Clustering

Amazon
arXiv:2208.10095v1 | 2 citations | h-index: 22
Originality: Highly original
AI Analysis

The paper addresses fairness-related harms, such as unequal quality of service, in clustering applications that involve sensitive demographic data.

It studies clustering on data that contains sensitive demographic groups and proposes a unified framework for socially fair center-based clustering and linear subspace clustering. The framework comes with efficient approximation algorithms that closely match or outperform state-of-the-art baselines on multiple benchmark datasets.

Center-based clustering (e.g., $k$-means, $k$-medians) and clustering using linear subspaces are two of the most popular techniques for partitioning real-world data into smaller clusters. However, when the data consists of sensitive demographic groups, a significantly different clustering cost per point across those groups can lead to fairness-related harms (e.g., different quality of service). The goal of socially fair clustering is to minimize the maximum clustering cost per point over all groups. In this work, we propose a unified framework for solving socially fair center-based clustering and linear subspace clustering, and give practical, efficient approximation algorithms for these problems. Extensive experiments show that on multiple benchmark datasets our algorithms either closely match or outperform state-of-the-art baselines.
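To make the objective in the abstract concrete, the following minimal sketch evaluates the socially fair cost of a fixed set of centers: the per-point cost is averaged within each group, and the largest group average is reported. It only evaluates the objective and is not the paper's approximation algorithm. The function name `socially_fair_cost`, the `power` parameter, the toy data, and the assumption that each point belongs to exactly one group are all illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def socially_fair_cost(points, groups, centers, power=2):
    """Return the maximum, over groups, of the average per-point cost.

    Hypothetical helper for illustration only.
    power=2 gives a k-means style cost, power=1 a k-medians style cost.
    Assumes every point belongs to exactly one group.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for point, group in zip(points, groups):
        # Euclidean distance from the point to its nearest center
        nearest = min(
            sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
            for center in centers
        )
        totals[group] += nearest ** power
        counts[group] += 1
    # Socially fair objective: the worst group average, not the overall average
    return max(totals[g] / counts[g] for g in totals)


# Toy data (made up): group "a" sits close to its center, group "b" does not
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
groups = ["a", "a", "b", "b"]
centers = [(1.0, 0.0), (10.0, 0.0)]

fair = socially_fair_cost(points, groups, centers)
print(fair)  # group "a" averages 1.0, group "b" averages 8.0, so this prints 8.0
```

On this toy input the overall average squared distance is 4.5, while the worst-off group pays 8.0 per point. That gap is the kind of disparity the socially fair objective is designed to reduce.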
