Gengyue Han

LG
3papers
15citations
Novelty60%
AI Score40

3 Papers

40.2LGMay 26
Transferable Reinforcement Learning via Probabilistic Latent Embeddings and Dynamic Policy Adaptation for Sim-to-Real Deployment

Gengyue Han, Yiheng Feng

Due to limited resources and public safety concerns, deep reinforcement learning (RL) agents for many cyber-physical systems (e.g., autonomous vehicles) are first trained in simulators. However, when deployed in real world environments, they often suffer from performance degradation or safety violations because of the inevitable Sim2Real gap. Existing zero-shot approaches, such as robust safe RL and domain randomization, mitigate this issue but typically at the cost of degraded performance or residual safety risks when experiencing unmodeled system dynamics. To address these limitations, we propose a novel reinforcement learning framework that enables safe and efficient policy transfer via probabilistic latent embeddings and dynamic policy adaptation. We consider a family of Constrained Markov Decision Processes (CMDPs) under different environment contexts. By leveraging latent context variable in meta-RL, the proposed framework infers the latent representation of the environment from simulated experiences. Furthermore, it incorporates a distributional RL formulation, which allows risk levels of the deployed policy to be adjusted dynamically, based on the estimation accuracy of the latent context variable. This strategy promotes safety at the early deployment stage and improves efficiency through fast policy adaptation under the Sim2Real gap.

SYOct 16, 2023
Joint Optimization of Traffic Signal Control and Vehicle Routing in Signalized Road Networks using Multi-Agent Deep Reinforcement Learning

Xianyue Peng, Hang Gao, Gengyue Han et al.

Urban traffic congestion is a critical predicament that plagues modern road networks. To alleviate this issue and enhance traffic efficiency, traffic signal control and vehicle routing have proven to be effective measures. In this paper, we propose a joint optimization approach for traffic signal control and vehicle routing in signalized road networks. The objective is to enhance network performance by simultaneously controlling signal timings and route choices using Multi-Agent Deep Reinforcement Learning (MADRL). Signal control agents (SAs) are employed to establish signal timings at intersections, whereas vehicle routing agents (RAs) are responsible for selecting vehicle routes. By establishing relevance between agents and enabling them to share observations and rewards, interaction and cooperation among agents are fostered, which enhances individual training. The Multi-Agent Advantage Actor-Critic algorithm is used to handle multi-agent environments, and Deep Neural Network (DNN) structures are designed to facilitate the algorithm's convergence. Notably, our work is the first to utilize MADRL in determining the optimal joint policy for signal control and vehicle routing. Numerical experiments conducted on the modified Sioux network demonstrate that our integration of signal control and vehicle routing outperforms controlling signal timings or vehicles' routes alone in enhancing traffic efficiency.

LGJan 16, 2024
CycLight: learning traffic signal cooperation with a cycle-level strategy

Gengyue Han, Xiaohan Liu, Xianyue Peng et al.

This study introduces CycLight, a novel cycle-level deep reinforcement learning (RL) approach for network-level adaptive traffic signal control (NATSC) systems. Unlike most traditional RL-based traffic controllers that focus on step-by-step decision making, CycLight adopts a cycle-level strategy, optimizing cycle length and splits simultaneously using Parameterized Deep Q-Networks (PDQN) algorithm. This cycle-level approach effectively reduces the computational burden associated with frequent data communication, meanwhile enhancing the practicality and safety of real-world applications. A decentralized framework is formulated for multi-agent cooperation, while attention mechanism is integrated to accurately assess the impact of the surroundings on the current intersection. CycLight is tested in a large synthetic traffic grid using the microscopic traffic simulation tool, SUMO. Experimental results not only demonstrate the superiority of CycLight over other state-of-the-art approaches but also showcase its robustness against information transmission delays.