Agent-Agnostic Centralized Training for Decentralized Multi-Agent Cooperative Driving
This work addresses traffic congestion and flow in autonomous vehicle systems and represents an incremental advance in decentralized multi-agent reinforcement learning.
The paper tackles infinite-horizon traffic flow and partial observability in decentralized multi-agent cooperative driving by proposing an asymmetric actor-critic model with masked attention networks, which improves traffic flow at bottleneck points without compromising safety.
Active traffic management with autonomous vehicles offers the potential for reduced congestion and improved traffic flow. However, developing effective algorithms for real-world scenarios requires overcoming challenges related to infinite-horizon traffic flow and partial observability. To address these issues and further decentralize traffic management, we propose an asymmetric actor-critic model that learns decentralized cooperative driving policies for autonomous vehicles using single-agent reinforcement learning. By employing attention neural networks with masking, our approach efficiently manages real-world traffic dynamics and partial observability, eliminating the need for predefined agents or agent-specific experience buffers in multi-agent reinforcement learning. Extensive evaluations across various traffic scenarios demonstrate our method's significant potential in improving traffic flow at critical bottleneck points. Moreover, we address the challenges posed by conservative autonomous vehicle driving behaviors that adhere strictly to traffic rules, showing that our cooperative policy effectively alleviates potential slowdowns without compromising safety.
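To illustrate how attention with masking can handle a changing number of observed vehicles, the following is a minimal sketch of masked scaled dot-product attention over a fixed number of padded vehicle slots. It is not the paper's architecture: the function name `masked_attention`, the slot layout, and the toy feature vectors are all assumptions made for illustration. The property it demonstrates is that padded or unobserved slots receive exactly zero weight, so a single policy network can pool over however many neighbors are currently visible without a predefined set of agents.

```python
import math
from typing import List, Sequence, Tuple


def masked_attention(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
    mask: Sequence[bool],
) -> Tuple[List[float], List[float]]:
    """Scaled dot-product attention over a padded set of vehicle slots.

    mask[i] is True when slot i holds a real, observed vehicle and False
    when it is padding. Masked slots receive exactly zero weight.
    (Hypothetical helper; names and layout are illustrative assumptions.)
    """
    if not any(mask):
        raise ValueError("at least one slot must be unmasked")
    scale = math.sqrt(len(query))
    # Masked slots get a score of -inf so they vanish after the softmax.
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if keep else -math.inf
        for key, keep in zip(keys, mask)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return pooled, weights


# Toy example: two observed vehicles and one padding slot.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
values = [[10.0, 0.0], [0.0, 10.0], [99.0, 99.0]]
mask = [True, True, False]
pooled, weights = masked_attention(query, keys, values, mask)
```

Because the pooled output is a weighted sum, it does not depend on the order of the slots or on whatever values sit in the padded positions, which is what makes this kind of layer a natural fit for an agent-agnostic policy.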