LG | MA | ML | Aug 10, 2019

Large-Scale Traffic Signal Control Using a Novel Multi-Agent Reinforcement Learning

arXiv:1908.03761v2 | 144 citations
AI Analysis

This paper addresses traffic congestion in urban road systems and represents an incremental improvement in multi-agent reinforcement learning for domain-specific applications.

The paper tackles large-scale traffic signal control by proposing a novel multi-agent reinforcement learning method called Cooperative double Q-learning (Co-DQL), which reduces average vehicle waiting time and outperforms state-of-the-art decentralized MARL algorithms in simulations.

Finding the optimal signal timing strategy is a difficult task in large-scale traffic signal control (TSC). Multi-Agent Reinforcement Learning (MARL) is a promising method to solve this problem. However, there is still room for improvement in scaling to large problems and in modeling, for each individual agent, the behaviors of the other agents. In this paper, a new MARL method, called Cooperative double Q-learning (Co-DQL), is proposed, which has several prominent features. It uses a highly scalable independent double Q-learning method based on double estimators and the UCB policy, which can eliminate the over-estimation problem of traditional independent Q-learning while ensuring exploration. It uses mean field approximation to model the interaction among agents, so that agents learn a better cooperative strategy. To improve the stability and robustness of the learning process, we introduce a new reward allocation mechanism and a local state sharing method. In addition, we analyze the convergence properties of the proposed algorithm. Co-DQL is applied to TSC and tested on a multi-traffic signal simulator. According to the results obtained on several traffic scenarios, Co-DQL outperforms several state-of-the-art decentralized MARL algorithms. It can effectively shorten the average waiting time of the vehicles in the whole road system.
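To make the "double estimators and the UCB policy" idea in the abstract more concrete, here is a minimal tabular sketch of a single independent agent. It is not the authors' Co-DQL implementation: the class name `DoubleQAgent`, the hyperparameters, and the UCB1-style bonus are assumptions chosen for illustration, and the mean field term, reward allocation, and local state sharing are left out entirely.

```python
import math
import random
from collections import defaultdict


class DoubleQAgent:
    """Illustrative tabular double Q-learning agent with a UCB1-style bonus.

    Hypothetical sketch only; names and defaults are not from the paper.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, c=1.0, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha    # learning rate
        self.gamma = gamma    # discount factor
        self.c = c            # exploration weight of the UCB bonus
        self.rng = random.Random(seed)
        # Two independent estimators, as in double Q-learning.
        self.q_a = defaultdict(lambda: [0.0] * n_actions)
        self.q_b = defaultdict(lambda: [0.0] * n_actions)
        # Visit counts per (state, action), used by the UCB bonus.
        self.counts = defaultdict(lambda: [0] * n_actions)

    def select_action(self, state):
        counts = self.counts[state]
        # Try every action once before applying the UCB formula.
        for action in range(self.n_actions):
            if counts[action] == 0:
                return action
        total = sum(counts)

        def score(action):
            mean_q = 0.5 * (self.q_a[state][action] + self.q_b[state][action])
            bonus = self.c * math.sqrt(math.log(total) / counts[action])
            return mean_q + bonus

        return max(range(self.n_actions), key=score)

    def update(self, state, action, reward, next_state):
        self.counts[state][action] += 1
        # Pick at random which table to update; the other one evaluates.
        if self.rng.random() < 0.5:
            updated, other = self.q_a, self.q_b
        else:
            updated, other = self.q_b, self.q_a
        # The greedy next action comes from the updated table, but its value
        # comes from the other table, which counters over-estimation.
        best_next = max(range(self.n_actions),
                        key=lambda a: updated[next_state][a])
        target = reward + self.gamma * other[next_state][best_next]
        updated[state][action] += self.alpha * (target - updated[state][action])
```

In a traffic setting one such agent would sit at each intersection, with `state` standing for a hashable local observation and `action` for a signal phase; both encodings are placeholders here. Decoupling action selection from action evaluation across the two tables is what reduces the upward bias of the `max` operator, while the count-based bonus keeps rarely tried actions in play.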

