LG CY DS

Aug 22, 2022

Socially Fair Center-based and Linear Subspace Clustering

Sruthi Gorantla, Kishen N. Gowda, Amit Deshpande, Anand Louis

Amazon

arXiv:2208.10095v13.32 citations, h-index: 22


AI Analysis

The paper addresses fairness-related harms, such as unequal quality-of-service, that arise when clustering is applied to data containing sensitive demographic groups.

The paper tackles the problem of fairness in clustering when data includes sensitive demographic groups, proposing a unified framework for socially fair center-based and linear subspace clustering with efficient approximation algorithms that match or outperform state-of-the-art baselines on multiple benchmark datasets.

Center-based clustering (e.g., $k$-means, $k$-medians) and clustering using linear subspaces are two of the most popular techniques for partitioning real-world data into smaller clusters. However, when the data contains sensitive demographic groups, a significantly different clustering cost per point across sensitive groups can lead to fairness-related harms (e.g., different quality-of-service). The goal of socially fair clustering is to minimize the maximum, over all groups, of the clustering cost per point. In this work, we propose a unified framework to solve socially fair center-based clustering and linear subspace clustering, and give practical, efficient approximation algorithms for these problems. We conduct extensive experiments showing that, on multiple benchmark datasets, our algorithms either closely match or outperform state-of-the-art baselines.
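To make the objective concrete, here is a minimal sketch that only evaluates the socially fair cost for a fixed set of centers, assuming the $k$-means cost (squared Euclidean distance to the nearest center). It is not the paper's approximation algorithm; the function names `group_costs` and `socially_fair_cost` and the toy data are hypothetical and chosen for illustration.

```python
import numpy as np


def group_costs(points, groups, centers):
    """Average k-means cost per point for each group (hypothetical helper).

    points:  (n, d) array of data points
    groups:  (n,) array of group labels, one per point
    centers: (k, d) array of cluster centers
    """
    # Squared Euclidean distance from every point to every center: shape (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Each point pays the cost of its nearest center.
    nearest = d2.min(axis=1)
    # Average the per-point cost separately within each group.
    return {g: float(nearest[groups == g].mean())
            for g in np.unique(groups).tolist()}


def socially_fair_cost(points, groups, centers):
    """Maximum over groups of the average per-point cost."""
    return max(group_costs(points, groups, centers).values())


# Toy data (assumed for illustration): group 0 is served worse than group 1.
points = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [10.0, 0.0]])
groups = np.array([0, 0, 1, 1])
centers = np.array([[1.0, 0.0], [10.0, 0.0]])

print(group_costs(points, groups, centers))         # per-group average cost
print(socially_fair_cost(points, groups, centers))  # cost of the worst-off group
```

In this toy instance the standard $k$-means objective would average over all four points, whereas the socially fair objective reports the average cost of the group that is served worst, which is the quantity a socially fair algorithm tries to minimize.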
