LG · Nov 27, 2024

Scalable Multi-Objective Reinforcement Learning with Fairness Guarantees using Lorenz Dominance

Dimitris Michailidis, Willem Röpke, Diederik M. Roijers, Sennay Ghebreab, Fernando P. Santos

arXiv:2411.18195v14.62 citations · h-index: 28 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses fairness and scalability challenges in MORL for applications like transport planning, representing an incremental advancement with novel method components.

The paper tackles the computational complexity and fairness issues in Multi-Objective Reinforcement Learning (MORL) by introducing an algorithm that uses Lorenz dominance to identify policies with equitable reward distributions and λ-Lorenz dominance to allow flexible fairness preferences, while improving scalability to many-objective problems. On a new real-world transport planning environment covering two large cities, Xi'an and Amsterdam, the method outperforms common multi-objective approaches, particularly in high-dimensional objective spaces.

Multi-Objective Reinforcement Learning (MORL) aims to learn a set of policies that optimize trade-offs between multiple, often conflicting objectives. MORL is computationally more complex than single-objective RL, particularly as the number of objectives increases. Additionally, when objectives involve the preferences of agents or groups, ensuring fairness is socially desirable. This paper introduces a principled algorithm that incorporates fairness into MORL while improving scalability to many-objective problems. We propose using Lorenz dominance to identify policies with equitable reward distributions and introduce λ-Lorenz dominance to enable flexible fairness preferences. We release a new, large-scale real-world transport planning environment and demonstrate that our method encourages the discovery of fair policies, showing improved scalability in two large cities (Xi'an and Amsterdam). Our methods outperform common multi-objective approaches, particularly in high-dimensional objective spaces.
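
To make the fairness criterion in the abstract concrete, the sketch below shows the standard textbook definition of Lorenz dominance on plain return vectors: sort each vector in ascending order, take cumulative sums, and compare the resulting Lorenz vectors with ordinary Pareto dominance. This is a minimal illustration, not the authors' implementation. The function names and example vectors are hypothetical, and the λ-Lorenz relaxation is deliberately left out because its exact definition is not given here.

```python
from itertools import accumulate


def lorenz_vector(returns):
    """Cumulative sums of the returns sorted in ascending order."""
    return list(accumulate(sorted(returns)))


def pareto_dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def lorenz_dominates(a, b):
    """a Lorenz-dominates b if a's Lorenz vector Pareto-dominates b's."""
    return pareto_dominates(lorenz_vector(a), lorenz_vector(b))


def lorenz_front(candidates):
    """Keep the candidates that no other candidate Lorenz-dominates."""
    return [
        v for v in candidates
        if not any(lorenz_dominates(u, v) for u in candidates)
    ]


# Hypothetical per-objective returns for four policies (two objectives each).
policies = [(4, 4), (1, 8), (8, 1), (2, 5)]
front = lorenz_front(policies)
```

In this toy example all four vectors are Pareto-incomparable, yet `(2, 5)` is removed because `(4, 4)` Lorenz-dominates it, which illustrates why a Lorenz front is never larger than the corresponding Pareto front and can help with many objectives.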
