LG Aug 21, 2020
Congested Urban Networks Tend to Be Insensitive to Signal Settings: Implications for Learning-Based Control
Jorge Laval, Hao Zhou
This paper highlights several properties of large urban networks that can have an impact on machine learning methods applied to traffic signal control. In particular, we show that the average network flow tends to be independent of the signal control policy as density increases. This property, which so far has remained under the radar, implies that deep reinforcement learning (DRL) methods become ineffective when trained under congested conditions, and might explain DRL's limited success for traffic signal control. Our results apply to all possible grid networks thanks to a parametrization based on two network parameters: the ratio of the expected distance between consecutive traffic lights to the expected green time, and the turning probability at intersections. Networks with different parameters exhibit very different responses to traffic signal control. Notably, we found that no control (i.e., a random policy) can be an effective control strategy for a surprisingly large family of networks. The impact of the turning probability turned out to be very significant both for baseline and for DRL policies. It also explains the loss of symmetry observed for these policies, which is not captured by existing theories that rely on corridor approximations without turns. Our findings also suggest that supervised learning methods have enormous potential, as they require very few examples to produce excellent policies.
SP Oct 2, 2019
Review of Learning-based Longitudinal Motion Planning for Autonomous Vehicles: Research Gaps between Self-driving and Traffic Congestion
Hao Zhou, Jorge Laval, Anye Zhou et al.
Self-driving technology companies and the research community are accelerating their pace to use machine learning longitudinal motion planning (mMP) for autonomous vehicles (AVs). This paper reviews the current state of the art in mMP, with an exclusive focus on its impact on traffic congestion. We identify the availability of congestion scenarios in current datasets, and summarize the required features for training mMP. For learning methods, we survey the major methods in both imitation learning and non-imitation learning. We also highlight the emerging technologies adopted by some leading AV companies, e.g., Tesla, Waymo, and Comma.ai. We find that: i) the AV industry has mostly focused on the long-tail problem related to safety and has overlooked the impact on traffic congestion, ii) the current public self-driving datasets do not include enough congestion scenarios, and mostly lack the necessary input features/output labels to train mMP, and iii) although reinforcement learning (RL) approaches can integrate congestion mitigation into the learning goal, the major mMP method adopted by industry is still behavior cloning (BC), whose capability to learn a congestion-mitigating mMP remains to be seen. Based on the review, the study identifies the research gaps in current mMP development. We propose some suggestions towards congestion mitigation for future mMP studies: i) enrich data collection to facilitate learning from congestion scenarios, ii) incorporate non-imitation learning methods to combine traffic efficiency into a safety-oriented technical route, and iii) integrate domain knowledge from traditional car-following (CF) theory to improve the string stability of mMP.