LG MLJun 11, 2020

Group-Fair Online Allocation in Continuous Time

Semih Cayci, Swati Gupta, Atilla Eryilmaz

arXiv:2006.06852v25.823 citations

Originality Incremental advance

AI Analysis

This work addresses fairness and efficiency in sequential decision-making for applications like online hiring and server allocation, though it is incremental as it extends existing online learning theory to continuous time with group fairness.

The paper tackles the problem of online allocation with fairness constraints in continuous-time settings, such as freelancing platforms and cloud computing, where task completion times are random and action-dependent. It introduces a framework that recovers various fair allocation rules and proposes an online algorithm achieving $ ilde{O}(B^{-1/2})$ regret without prior statistical knowledge.

The theory of discrete-time online learning has been successfully applied in many problems that involve sequential decision-making under uncertainty. However, in many applications including contractual hiring in online freelancing platforms and server allocation in cloud computing systems, the outcome of each action is observed only after a random and action-dependent time. Furthermore, as a consequence of certain ethical and economic concerns, the controller may impose deadlines on the completion of each task, and require fairness across different groups in the allocation of total time budget $B$. In order to address these applications, we consider continuous-time online learning problem with fairness considerations, and present a novel framework based on continuous-time utility maximization. We show that this formulation recovers reward-maximizing, max-min fair and proportionally fair allocation rules across different groups as special cases. We characterize the optimal offline policy, which allocates the total time between different actions in an optimally fair way (as defined by the utility function), and impose deadlines to maximize time-efficiency. In the absence of any statistical knowledge, we propose a novel online learning algorithm based on dual ascent optimization for time averages, and prove that it achieves $\tilde{O}(B^{-1/2})$ regret bound.

View on arXiv PDF

Similar