SY · AI · LG · Nov 23, 2022

Reinforcement learning for traffic signal control in hybrid action space

arXiv:2211.12956v2 · 24 citations · h-index: 21
Originality: Incremental advance
AI Analysis

The work addresses urban traffic congestion, a concern for both planners and drivers, and represents a domain-specific incremental improvement.

The paper tackles traffic signal control by proposing TBO, a reinforcement learning algorithm that synchronously optimizes both staging and duration in a hybrid action space, reducing average queue length by 13.78% and delay by 14.08% compared to baselines while maintaining fairness.

The prevailing reinforcement-learning-based traffic signal control methods are typically either staging-optimizable or duration-optimizable, depending on their action spaces. In this paper, we propose a novel control architecture, TBO, which is based on hybrid proximal policy optimization. To the best of our knowledge, TBO is the first RL-based algorithm to implement synchronous optimization of the staging and duration. Compared to discrete and continuous action spaces, the hybrid action space is a merged search space in which TBO better handles the trade-off between frequent switching and unsaturated release. Experiments demonstrate that TBO reduces the queue length and delay by 13.78% and 14.08% on average, respectively, compared to the existing baselines. Furthermore, we calculate the Gini coefficients of the right-of-way to show that TBO does not harm fairness while improving efficiency.
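
To make the two central ideas of the abstract concrete, here is a minimal Python sketch of (1) a hybrid action that pairs a discrete stage choice with a continuous green duration, and (2) a Gini coefficient computed over a right-of-way allocation. This is an illustration only, not the authors' implementation: the abstract does not say how TBO parameterizes its policy or which quantity the Gini coefficient is taken over. The names `HybridAction`, `sample_hybrid_action`, `gini`, the per-stage Gaussian duration head, and the `min_green`/`max_green` bounds are all assumptions introduced for this example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class HybridAction:
    stage: int       # discrete part: index of the signal stage to serve next
    duration: float  # continuous part: green time in seconds


def sample_hybrid_action(stage_logits, duration_means, duration_std,
                         min_green=5.0, max_green=60.0, rng=random):
    """Sample a (stage, duration) pair from a toy hybrid policy.

    Assumed structure: a softmax over stages and one Gaussian duration
    per stage. The bounds min_green/max_green are hypothetical.
    """
    # Numerically stable softmax weights for the discrete stage choice.
    top = max(stage_logits)
    weights = [math.exp(x - top) for x in stage_logits]
    stage = rng.choices(range(len(stage_logits)), weights=weights, k=1)[0]

    # Continuous duration, drawn for the chosen stage and clipped to bounds.
    raw = rng.gauss(duration_means[stage], duration_std)
    duration = min(max(raw, min_green), max_green)
    return HybridAction(stage=stage, duration=duration)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted sample.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    action = sample_hybrid_action(
        stage_logits=[0.1, 2.0, -1.0, 0.5],
        duration_means=[20.0, 30.0, 15.0, 25.0],
        duration_std=5.0,
        rng=random.Random(0),
    )
    print(action)
    # Hypothetical green-time totals (seconds) granted to four approaches.
    print(gini([120.0, 110.0, 130.0, 125.0]))
```

A Gini coefficient near 0 would indicate that right-of-way is shared almost evenly across approaches, while a value approaching 1 would indicate that a few approaches receive most of it. Sampling the stage and the duration in a single step is what distinguishes the hybrid setting from methods that fix one of the two and optimize the other.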
