LGMLJun 19, 2019

Variational Fair Clustering

arXiv:1906.08207v513 citations
Originality Incremental advance
AI Analysis

This work addresses fairness in clustering for data science applications, offering a scalable method that is competitive with state-of-the-art approaches.

The authors tackled the problem of fair clustering by proposing a variational framework that integrates a KL fairness term with various clustering objectives, enabling control over the fairness-clustering trade-off and yielding scalable, competitive solutions on benchmarks.

We propose a general variational framework of fair clustering, which integrates an original Kullback-Leibler (KL) fairness term with a large class of clustering objectives, including prototype or graph based. Fundamentally different from the existing combinatorial and spectral solutions, our variational multi-term approach enables to control the trade-off levels between the fairness and clustering objectives. We derive a general tight upper bound based on a concave-convex decomposition of our fairness term, its Lipschitz-gradient property and the Pinsker's inequality. Our tight upper bound can be jointly optimized with various clustering objectives, while yielding a scalable solution, with convergence guarantee. Interestingly, at each iteration, it performs an independent update for each assignment variable. Therefore, it can be easily distributed for large-scale datasets. This scalability is important as it enables to explore different trade-off levels between the fairness and clustering objectives. Unlike spectral relaxation, our formulation does not require computing its eigenvalue decomposition. We report comprehensive evaluations and comparisons with state-of-the-art methods over various fair-clustering benchmarks, which show that our variational formulation can yield highly competitive solutions in terms of fairness and clustering objectives.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes