ML LGJun 8, 2020

A Notion of Individual Fairness for Clustering

Matthäus Kleindessner, Pranjal Awasthi, Jamie Morgenstern

arXiv:2006.04960v115.532 citations

Originality Incremental advance

AI Analysis

This addresses the problem of ensuring fairness in clustering algorithms for data scientists and policymakers, but it is incremental as it extends individual fairness concepts from classification to clustering.

The paper tackles the lack of a notion of individual fairness in clustering by proposing one where each data point is, on average, closer to points in its own cluster than to those in others, and shows NP-hardness for general cases but provides an efficient algorithm for data on a real line and heuristics for general data.

A common distinction in fair machine learning, in particular in fair classification, is between group fairness and individual fairness. In the context of clustering, group fairness has been studied extensively in recent years; however, individual fairness for clustering has hardly been explored. In this paper, we propose a natural notion of individual fairness for clustering. Our notion asks that every data point, on average, is closer to the points in its own cluster than to the points in any other cluster. We study several questions related to our proposed notion of individual fairness. On the negative side, we show that deciding whether a given data set allows for such an individually fair clustering in general is NP-hard. On the positive side, for the special case of a data set lying on the real line, we propose an efficient dynamic programming approach to find an individually fair clustering. For general data sets, we investigate heuristics aimed at minimizing the number of individual fairness violations and compare them to standard clustering approaches on real data sets.

View on arXiv PDF

Similar