LGJun 23, 2023

Trading-off price for data quality to achieve fair online allocation

arXiv:2306.13440v25 citationsh-index: 27
Originality Incremental advance
AI Analysis

This work addresses fairness in online allocation for decision-makers who cannot directly observe protected attributes, offering a practical solution by trading off cost for data quality, though it is incremental in combining bandit and allocation problems.

The paper tackles the problem of online allocation with long-term fairness constraints when protected attributes are unobserved, by allowing the purchase of data from varying quality sources to estimate these attributes. The authors propose an algorithm that jointly selects data sources and performs allocations, achieving a regret bound of O(√T) and adapting to different fairness notions.

We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead they can purchase data that help estimate them from sources of different quality; and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $\mathcal{O}(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated by the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes