AIFeb 14, 2021

Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment

arXiv:2102.07120v37 citations
Originality Synthesis-oriented
AI Analysis

This work addresses fairness in automated decision-making for resource allocation, which is important for human welfare, but it is incremental as it applies existing methods to a new fairness formulation in AMDPs.

The paper tackles fairness in long-term resource allocation by proposing a quota-based fairness notion for average Markov Decision Processes (AMDPs), ensuring each state's visitation frequency meets a specified fraction, and develops a Stochastic Mirror Descent algorithm with sample complexity bounds and experimental validation on simulated data.

Fairness has emerged as an important concern in automated decision-making in recent years, especially when these decisions affect human welfare. In this work, we study fairness in temporally extended decision-making settings, specifically those formulated as Markov Decision Processes (MDPs). Our proposed notion of fairness ensures that each state's long-term visitation frequency is at least a specified fraction. This quota-based notion of fairness is natural in many resource-allocation settings where the dynamics of a single resource being allocated is governed by an MDP and the distribution of the shared resource is captured by its state-visitation frequency. In an average-reward MDP (AMDP) setting, we formulate the problem as a bilinear saddle point program and, for a generative model, solve it using a Stochastic Mirror Descent (SMD) based algorithm. The proposed solution guarantees a simultaneous approximation on the expected average-reward and fairness requirement. We give sample complexity bounds for the proposed algorithm and validate our theoretical results with experiments on simulated data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes