LG · MA · SY · Dec 14, 2023

Global Rewards in Multi-Agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Heiko Hoppe, Tobias Enders, Quentin Cappart, Maximilian Schiffer

arXiv:2312.08884v2 · Has Code · L4DC

Originality: Incremental advance

AI Analysis

The paper addresses performance issues in scalable vehicle dispatching for mobility services, though the contribution appears incremental: it modifies the reward structure of existing multi-agent frameworks.

The paper tackles vehicle dispatching in autonomous mobility on demand systems by proposing a global-rewards-based multi-agent deep reinforcement learning algorithm, which shows statistically significant improvements over state-of-the-art methods with local rewards on real-world data.

We study vehicle dispatching in autonomous mobility on demand (AMoD) systems, where a central operator assigns vehicles to customer requests or rejects these with the aim of maximizing its total profit. Recent approaches use multi-agent deep reinforcement learning (MADRL) to realize scalable yet performant algorithms, but train agents based on local rewards, which distorts the reward signal with respect to the system-wide profit, leading to lower performance. We therefore propose a novel global-rewards-based MADRL algorithm for vehicle dispatching in AMoD systems, which resolves previously existing goal conflicts between the trained agents and the operator by assigning rewards to agents using a counterfactual baseline. Our algorithm shows statistically significant improvements across various settings on real-world data compared to state-of-the-art MADRL algorithms with local rewards. We further provide a structural analysis which shows that the utilization of global rewards can improve implicit vehicle balancing and demand forecasting abilities. Our code is available at https://github.com/tumBAIS/GR-MADRL-AMoD.
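To make the idea of assigning a global reward to individual agents through a counterfactual baseline more concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify the exact formulation. It assumes one common form of counterfactual baseline: each agent's credit is the global value of the joint action minus the expected global value when only that agent's action is resampled from its own policy. The function names `counterfactual_advantages` and `toy_profit`, the accept/reject action encoding, and the profit numbers are all hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def counterfactual_advantages(
    joint_action: Sequence[int],
    policies: Sequence[Sequence[float]],
    global_value: Callable[[Tuple[int, ...]], float],
) -> List[float]:
    """Per-agent credit: global value minus a counterfactual baseline.

    For agent i, the baseline is the expected global value when agent i's
    action is drawn from its own policy while all other actions stay fixed.
    """
    actual = global_value(tuple(joint_action))
    advantages = []
    for i, policy in enumerate(policies):
        baseline = 0.0
        for alt_action, prob in enumerate(policy):
            counterfactual = list(joint_action)
            counterfactual[i] = alt_action  # swap only agent i's action
            baseline += prob * global_value(tuple(counterfactual))
        advantages.append(actual - baseline)
    return advantages


def toy_profit(joint_action: Tuple[int, ...]) -> float:
    """Hypothetical system-wide profit for a single request.

    Action 0 = reject, 1 = accept. The operator earns 10 if at least one
    vehicle accepts, and every accepting vehicle incurs a cost of 3.
    """
    n_accept = sum(joint_action)
    revenue = 10.0 if n_accept >= 1 else 0.0
    return revenue - 3.0 * n_accept


# Two vehicles, each with a uniform policy over {reject, accept}.
policies = [[0.5, 0.5], [0.5, 0.5]]

both_accept = counterfactual_advantages((1, 1), policies, toy_profit)
one_accepts = counterfactual_advantages((1, 0), policies, toy_profit)
print(both_accept)
print(one_accepts)
```

In this toy setting, both vehicles receive negative credit when they redundantly accept the same request, while both receive positive credit when exactly one accepts. A purely local reward would not distinguish these cases in the same way, which is the kind of goal conflict between agents and operator that the abstract describes.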
