56.7 · DS · Mar 24 · Code
Gabow's $O(\sqrt{n}m)$ Maximum Cardinality Matching Algorithm, Revisited
Kurt Mehlhorn, Romina Nobahari
We revisit Gabow's $O(\sqrt{n} m)$ maximum cardinality matching algorithm (The Weighted Matching Approach to Maximum Cardinality Matching, Fundamenta Informaticae, 2017). It adapts the weighted matching algorithm of Gabow and Tarjan~\cite{GT91} to maximum cardinality matching. Gabow's algorithm works iteratively. In each iteration, it constructs a maximal set of edge-disjoint shortest augmenting paths with respect to the current matching and augments along them. It is well known that $O(\sqrt{n})$ iterations suffice. Each iteration consists of three parts. In the first part, the length of the shortest augmenting path is computed. In the second part, an auxiliary graph $H$ is constructed with the property that shortest augmenting paths in $G$ correspond to augmenting paths in $H$. In the third part, a maximal set of edge-disjoint augmenting paths in $H$ is determined, and the paths are lifted to $G$ and used to augment the matching. We give a new algorithm for the first part. Gabow's algorithm for the first part is derived from Edmonds' primal-dual algorithm for weighted matching. We believe that our approach is more direct and will be easier to teach. We have implemented the algorithm; the implementation is available at the companion webpage (https://people.mpi-inf.mpg.de/~mehlhorn/CompanionPageGenMatchingImplementation.html).
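The phase structure described above (compute the shortest augmenting path length, then augment along a maximal set of disjoint shortest paths, and repeat) can be illustrated on the much simpler bipartite case. The sketch below is the classical Hopcroft-Karp scheme, not Gabow's algorithm and not the authors' implementation: it handles only bipartite graphs, where no blossoms arise and the layered graph found by breadth-first search plays the role of the auxiliary graph $H$. The function name `hopcroft_karp` and the adjacency-list input format are assumptions made for illustration.

```python
from collections import deque

INF = float("inf")

def hopcroft_karp(adj, n_left, n_right):
    """Bipartite maximum matching by phases of shortest augmenting paths.

    adj[u] lists the right-side neighbors of left vertex u (hypothetical format).
    Returns (matching size, match_l, number of phases).
    """
    match_l = [-1] * n_left    # match_l[u] = right partner of u, or -1
    match_r = [-1] * n_right   # match_r[v] = left partner of v, or -1

    def bfs():
        # Part 1: layer the left vertices by alternating distance from the
        # free left vertices; `limit` is the layer of the last left vertex
        # on a shortest augmenting path (INF if no augmenting path exists).
        dist = [INF] * n_left
        queue = deque()
        for u in range(n_left):
            if match_l[u] == -1:
                dist[u] = 0
                queue.append(u)
        limit = INF
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                w = match_r[v]
                if w == -1:
                    limit = min(limit, dist[u])
                elif dist[w] == INF and dist[u] < limit:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return dist, limit

    def dfs(u, dist, limit):
        # Parts 2 and 3: follow layered edges only and augment on success.
        d = dist[u]
        dist[u] = INF  # visit each left vertex at most once per phase
        for v in adj[u]:
            w = match_r[v]
            if w == -1:
                ok = d == limit  # accept only paths of the shortest length
            else:
                ok = dist[w] == d + 1 and dfs(w, dist, limit)
            if ok:
                match_l[u] = v
                match_r[v] = u
                return True
        return False

    phases = 0
    while True:
        dist, limit = bfs()
        if limit == INF:
            break  # no augmenting path left: the matching is maximum
        phases += 1
        for u in range(n_left):
            if match_l[u] == -1:
                dfs(u, dist, limit)

    size = sum(1 for v in match_l if v != -1)
    return size, match_l, phases
```

Calling `hopcroft_karp([[0, 1], [0], [2, 3], [2]], 4, 4)` returns a perfect matching of size 4. The recursive search is adequate for small inputs; a large graph would need an explicit stack to stay within Python's recursion limit.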
LG · Jun 10, 2025
Towards Fair Representation: Clustering and Consensus
Diptarka Chakraborty, Kushagra Chatterjee, Debarati Das et al.
Consensus clustering, a fundamental task in machine learning and data analysis, aims to aggregate multiple input clusterings of a dataset, potentially based on different non-sensitive attributes, into a single clustering that best represents the collective structure of the data. In this work, we study this fundamental problem through the lens of fair clustering, as introduced by Chierichetti et al. [NeurIPS'17], which incorporates the disparate impact doctrine to ensure proportional representation of each protected group in the dataset within every cluster. Our objective is to find a consensus clustering that is not only representative but also fair with respect to specific protected attributes. To the best of our knowledge, we are the first to address this problem and provide a constant-factor approximation. As part of our investigation, we examine how to minimally modify an existing clustering to enforce fairness -- an essential postprocessing step in many clustering applications that require fair representation. We develop an optimal algorithm for datasets with equal group representation and near-linear-time constant-factor approximation algorithms for more general scenarios in which the two groups have different sizes. We complement our approximation result by showing that the problem is NP-hard for two unequal-sized groups. Given the fundamental nature of this problem, we believe our results on Closest Fair Clustering could have broader implications for other clustering problems, particularly those for which no prior approximation guarantees exist for their fair variants.
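The proportional-representation requirement can be made concrete with a small check. The sketch below assumes the two-group balance measure of Chierichetti et al.: the balance of a cluster is the smaller of the two ratios between its group counts, and the balance of a clustering is the minimum over its clusters. The abstract does not state the exact fairness condition used in the paper, so the function names, the `"red"`/`"blue"` labels, and the input format are illustrative assumptions only.

```python
from collections import Counter

def cluster_balance(cluster, color):
    """Balance of one cluster: min(#red/#blue, #blue/#red), or 0 if a group is absent."""
    counts = Counter(color[x] for x in cluster)
    red, blue = counts.get("red", 0), counts.get("blue", 0)
    if red == 0 or blue == 0:
        return 0.0
    return min(red / blue, blue / red)

def clustering_balance(clusters, color):
    """Balance of a clustering: the minimum balance over all of its clusters."""
    return min(cluster_balance(c, color) for c in clusters)

# Hypothetical dataset: four red points and two blue points.
color = {0: "red", 1: "red", 2: "red", 3: "red", 4: "blue", 5: "blue"}

# Every cluster mirrors the 2:1 ratio of the dataset, so the balance is 0.5.
proportional = [[0, 1, 4], [2, 3, 5]]

# One cluster contains no blue point, so the balance drops to 0.
skewed = [[0, 1, 2, 3], [4, 5]]
```

Under this measure, a clustering of the dataset above cannot exceed balance 0.5, the ratio of the dataset itself, so `proportional` is as fair as possible here while `skewed` is maximally unfair.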