Generalizing Fair Top-$k$ Selection: An Integrative Approach

arXiv:2603.04689v1
Originality Highly original
AI Analysis

This work addresses the computational intractability of fair top-k selection for multiple protected groups, which is a problem for practitioners needing to ensure proportional representation in selection processes.

This paper generalizes fair top-k selection to multiple protected groups and minimizes disparity from a reference scoring function. It finds that the problem becomes computationally intractable even for small k and two-dimensional datasets when considering a critical issue affecting fairness, but recovers efficiency for small k and a small number of protected groups. The paper also introduces an alternative disparity measure, utility loss, which yields a more stable scoring function.

Fair top-$k$ selection, which ensures appropriate proportional representation of members from minority or historically disadvantaged groups among the top-$k$ selected candidates, has drawn significant attention. We study the problem of finding a fair (linear) scoring function with multiple protected groups while also minimizing the disparity from a reference scoring function. This generalizes the prior setup, which was restricted to the single-group setting without disparity minimization. Previous studies imply that the number of protected groups may have a limited impact on the runtime efficiency. However, driven by the need for experimental exploration, we find that this implication overlooks a critical issue that may affect the fairness of the outcome. Once this issue is properly considered, our hardness analysis shows that the problem may become computationally intractable even for a two-dimensional dataset and small values of $k$. However, our analysis also reveals a gap in the hardness barrier, enabling us to recover the efficiency for the case of small $k$ when the number of protected groups is sufficiently small. Furthermore, beyond measuring disparity as the "distance" between the fair and the reference scoring functions, we introduce an alternative disparity measure$\unicode{x2014}$utility loss$\unicode{x2014}$that may yield a more stable scoring function under small weight perturbations. Through careful engineering trade-offs that balance implementation complexity, robustness, and performance, our augmented two-pronged solution demonstrates strong empirical performance on real-world datasets, with experimental observations also informing algorithm design and implementation decisions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes