SY AI
Mar 28, 2022

Deep Reinforcement Learning Aided Platoon Control Relying on V2X Information

Lei Lei, Tong Liu, Kan Zheng, Lajos Hanzo

arXiv:2203.15781v1
46 citations
h-index: 70

Originality: Incremental advance

AI Analysis

This work addresses platoon control for autonomous vehicles, but it is incremental as it focuses on optimizing information selection within an existing DRL framework.

The paper tackles the problem of optimizing platoon control using deep reinforcement learning by determining which Vehicle-to-Everything (V2X) information to share to balance uncertainty reduction and dimensionality issues, with simulation results illustrating the theoretical analysis.

The impact of Vehicle-to-Everything (V2X) communications on platoon control performance is investigated. Platoon control is essentially a sequential stochastic decision problem (SSDP), which can be solved by Deep Reinforcement Learning (DRL) to deal with both the control constraints and the uncertainty in the platoon leading vehicle's behavior. In this context, the value of V2X communications for DRL-based platoon controllers is studied with an emphasis on the tradeoff between the gain of including exogenous information in the system state for reducing uncertainty and the performance erosion due to the curse of dimensionality. Our objective is to find the specific set of information that should be shared among the vehicles for the construction of the most appropriate state space. SSDP models are conceived for platoon control under different information topologies (IFT) by taking into account 'just sufficient' information. Furthermore, theorems are established for comparing the performance of their optimal policies. In order to determine whether a piece of information should be transmitted for improving the DRL-based control policy, we quantify its value by deriving the conditional KL divergence of the transition models. More meritorious information is given higher priority in transmission, since including it in the state space has a higher probability of offsetting the negative effect of having higher state dimensions. Finally, simulation results are provided to illustrate the theoretical analysis.
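The abstract does not spell out how the conditional KL divergence is computed, so the following is only a minimal sketch of the standard definition for discrete distributions, D = sum_x w(x) * KL(p(.|x) || q(.|x)). It rests on one plausible reading: p is a transition model that conditions on a candidate piece of V2X information x, and q is a model that ignores it. The function name `conditional_kl` and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def conditional_kl(p, q, weights):
    """Conditional KL divergence sum_x w(x) * KL(p(.|x) || q(.|x)), in nats.

    p, q    : lists of rows; row x is a distribution over next states.
    weights : distribution over the conditioning variable x.
    """
    total = 0.0
    for w, p_row, q_row in zip(weights, p, q):
        if w == 0.0:
            continue  # conditioning values that never occur contribute nothing
        for p_i, q_i in zip(p_row, q_row):
            if p_i == 0.0:
                continue  # 0 * log(0 / q) is taken as 0 by convention
            if q_i == 0.0:
                return math.inf  # p puts mass where q puts none
            total += w * p_i * math.log(p_i / q_i)
    return total


# Toy assumption: x is a binary piece of shared information, equally likely
# to take either value, and there are two possible next states.
weights = [0.5, 0.5]

# Model that ignores x: the same next-state distribution for every x.
q_without = [[0.5, 0.5], [0.5, 0.5]]

# Case 1: x strongly changes the next-state distribution (informative).
p_informative = [[0.9, 0.1], [0.1, 0.9]]
value_informative = conditional_kl(p_informative, q_without, weights)

# Case 2: x does not change the next-state distribution (uninformative).
p_useless = [[0.5, 0.5], [0.5, 0.5]]
value_useless = conditional_kl(p_useless, q_without, weights)

print(value_informative, value_useless)
```

Under this reading, a larger divergence means that knowing x changes the predicted transitions more, so x would rank higher for transmission. A divergence of zero means x adds a state dimension without reducing uncertainty.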
