MLLGJun 8, 2020

A Notion of Individual Fairness for Clustering

arXiv:2006.04960v132 citations
Originality Incremental advance
AI Analysis

This addresses the problem of ensuring fairness in clustering algorithms for data scientists and policymakers, but it is incremental as it extends individual fairness concepts from classification to clustering.

The paper tackles the lack of a notion of individual fairness in clustering by proposing one where each data point is, on average, closer to points in its own cluster than to those in others, and shows NP-hardness for general cases but provides an efficient algorithm for data on a real line and heuristics for general data.

A common distinction in fair machine learning, in particular in fair classification, is between group fairness and individual fairness. In the context of clustering, group fairness has been studied extensively in recent years; however, individual fairness for clustering has hardly been explored. In this paper, we propose a natural notion of individual fairness for clustering. Our notion asks that every data point, on average, is closer to the points in its own cluster than to the points in any other cluster. We study several questions related to our proposed notion of individual fairness. On the negative side, we show that deciding whether a given data set allows for such an individually fair clustering in general is NP-hard. On the positive side, for the special case of a data set lying on the real line, we propose an efficient dynamic programming approach to find an individually fair clustering. For general data sets, we investigate heuristics aimed at minimizing the number of individual fairness violations and compare them to standard clustering approaches on real data sets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes