LG, CE · Nov 14, 2023

Purpose in the Machine: Do Traffic Simulators Produce Distributionally Equivalent Outcomes for Reinforcement Learning Applications?

Rex Chen, Kathleen M. Carley, Fei Fang, Norman Sadeh

arXiv:2311.08429v12.02 citations · h-index: 22

Originality: Synthesis-oriented

AI Analysis

This work highlights a critical problem for researchers and practitioners in intelligent transportation systems: simulator choice can affect both RL training and deployment. Its contribution to simulation validation itself is incremental.

The study investigates whether the traffic simulators CityFlow and SUMO produce distributionally equivalent outcomes for reinforcement learning applications. It finds evidence against equivalence: the root mean squared error and KL divergence are significantly greater than zero for all assessed RL-relevant measures.

Traffic simulators are used to generate data for learning in intelligent transportation systems (ITSs). A key question is to what extent their modelling assumptions affect the capabilities of ITSs to adapt to various scenarios when deployed in the real world. This work focuses on two simulators commonly used to train reinforcement learning (RL) agents for traffic applications, CityFlow and SUMO. A controlled virtual experiment varying driver behavior and simulation scale finds evidence against distributional equivalence in RL-relevant measures from these simulators, with the root mean squared error and KL divergence being significantly greater than 0 for all assessed measures. While granular real-world validation generally remains infeasible, these findings suggest that traffic simulators are not a deus ex machina for RL training: understanding the impacts of inter-simulator differences is necessary to train and deploy RL-based ITSs.
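
To make the two discrepancy measures concrete, here is a minimal sketch of how RMSE and KL divergence might be computed for one RL-relevant measure recorded in two simulators. It does not reproduce the paper's estimation procedure, which this page does not describe. It assumes paired measurements for the RMSE and a smoothed histogram estimate for the KL divergence. The function names, bin count, smoothing constant, and data are all hypothetical; the samples are synthetic and are not CityFlow or SUMO output.

```python
import math
import random


def rmse(xs, ys):
    """Root mean squared error between paired measurements."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def histogram(samples, lo, hi, bins, eps=1e-9):
    """Normalized histogram over [lo, hi] with additive smoothing (assumed)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        # Clamp so that s == hi falls into the last bin.
        idx = min(int((s - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(samples) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p_samples, q_samples, bins=10):
    """Histogram estimate of KL(P || Q) on a shared set of bins."""
    lo = min(min(p_samples), min(q_samples))
    hi = max(max(p_samples), max(q_samples))
    if hi == lo:
        return 0.0  # every sample is identical, so the histograms coincide
    p = histogram(p_samples, lo, hi, bins)
    q = histogram(q_samples, lo, hi, bins)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Synthetic stand-ins for one measure (e.g. travel time) from two simulators.
rng = random.Random(0)
sim_a = [rng.gauss(30.0, 5.0) for _ in range(500)]
sim_b = [rng.gauss(33.0, 5.0) for _ in range(500)]

print("RMSE:", rmse(sim_a, sim_b))
print("KL(A || B):", kl_divergence(sim_a, sim_b))
```

Both quantities are zero when the two simulators produce identical outcomes, which is why values significantly greater than zero count as evidence against distributional equivalence. The histogram estimate depends on the bin count and smoothing constant, so its absolute value should be read with care.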
