AIFeb 14, 2021

Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment

Ganesh Ghalme, Vineet Nair, Vishakha Patil, Yilun Zhou

arXiv:2102.07120v38.97 citations

Originality Synthesis-oriented

AI Analysis

This work addresses fairness in automated decision-making for resource allocation, which is important for human welfare, but it is incremental as it applies existing methods to a new fairness formulation in AMDPs.

The paper tackles fairness in long-term resource allocation by proposing a quota-based fairness notion for average Markov Decision Processes (AMDPs), ensuring each state's visitation frequency meets a specified fraction, and develops a Stochastic Mirror Descent algorithm with sample complexity bounds and experimental validation on simulated data.

Fairness has emerged as an important concern in automated decision-making in recent years, especially when these decisions affect human welfare. In this work, we study fairness in temporally extended decision-making settings, specifically those formulated as Markov Decision Processes (MDPs). Our proposed notion of fairness ensures that each state's long-term visitation frequency is at least a specified fraction. This quota-based notion of fairness is natural in many resource-allocation settings where the dynamics of a single resource being allocated is governed by an MDP and the distribution of the shared resource is captured by its state-visitation frequency. In an average-reward MDP (AMDP) setting, we formulate the problem as a bilinear saddle point program and, for a generative model, solve it using a Stochastic Mirror Descent (SMD) based algorithm. The proposed solution guarantees a simultaneous approximation on the expected average-reward and fairness requirement. We give sample complexity bounds for the proposed algorithm and validate our theoretical results with experiments on simulated data.

View on arXiv PDF

Similar