LGCYMLMay 8, 2021

Consistency of Constrained Spectral Clustering under Graph Induced Fair Planted Partitions

arXiv:2105.03714v216 citations
Originality Incremental advance
AI Analysis

This work addresses fairness in clustering for applications where sensitive attributes are not directly available, offering a theoretical foundation for practitioners, though it is incremental in extending existing spectral clustering methods.

The paper tackles the problem of enforcing individual-level fairness in spectral clustering when sensitive attributes are indirectly observed through a representation graph, and it provides the first consistency results for constrained spectral clustering under this constraint, with numerical validation.

Spectral clustering is popular among practitioners and theoreticians alike. While performance guarantees for spectral clustering are well understood, recent studies have focused on enforcing ``fairness'' in clusters, requiring them to be ``balanced'' with respect to a categorical sensitive node attribute (e.g. the race distribution in clusters must match the race distribution in the population). In this paper, we consider a setting where sensitive attributes indirectly manifest in an auxiliary \textit{representation graph} rather than being directly observed. This graph specifies node pairs that can represent each other with respect to sensitive attributes and is observed in addition to the usual \textit{similarity graph}. Our goal is to find clusters in the similarity graph while respecting a new individual-level fairness constraint encoded by the representation graph. We develop variants of unnormalized and normalized spectral clustering for this task and analyze their performance under a \emph{fair} planted partition model induced by the representation graph. This model uses both the cluster membership of the nodes and the structure of the representation graph to generate random similarity graphs. To the best of our knowledge, these are the first consistency results for constrained spectral clustering under an individual-level fairness constraint. Numerical results corroborate our theoretical findings.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes