Fairness in Streaming Submodular Maximization over a Matroid Constraint
This addresses fairness in subset selection for large-scale datasets with sensitive attributes, offering incremental improvements over existing methods for specific constraints.
The paper tackles the problem of ensuring fairness in streaming submodular maximization under a matroid constraint, extending prior work from cardinality constraints, and provides algorithms with empirical validation on real-world applications like clustering and recommendation.
Streaming submodular maximization is a natural model for the task of selecting a representative subset from a large-scale dataset. If datapoints have sensitive attributes such as gender or race, it becomes important to enforce fairness to avoid bias and discrimination. This has spurred significant interest in developing fair machine learning algorithms. Recently, such algorithms have been developed for monotone submodular maximization under a cardinality constraint. In this paper, we study the natural generalization of this problem to a matroid constraint. We give streaming algorithms as well as impossibility results that provide trade-offs between efficiency, quality and fairness. We validate our findings empirically on a range of well-known real-world applications: exemplar-based clustering, movie recommendation, and maximum coverage in social networks.