LG AI RO | Nov 24, 2022

Explainable and Safe Reinforcement Learning for Autonomous Air Mobility

Lei Wang, Hongyu Yang, Yi Lin, Suwan Yin, Yuankai Wu

arXiv:2211.13474v1 | 1.87 citations | h-index: 27 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses safety and explainability issues for autonomous air mobility systems, an incremental improvement in a domain-specific application.

The paper tackles explainability and safety in deep reinforcement learning (DRL) controllers for autonomous air traffic control. It proposes a framework that decouples safety and efficiency learning and reports improved performance and explainability in simulations. It also introduces an adversarial attack strategy that increases collisions as much as a uniform attack while attacking far less often.
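To make the idea of decoupling safety and efficiency learning concrete, here is a minimal sketch of action selection from two separate Q-value heads. It is an illustration under stated assumptions, not the authors' implementation: the weighted-sum combination rule, the `safety_weight` parameter, the function names, and the toy Q values are all hypothetical.

```python
import numpy as np


def select_action(q_safety, q_efficiency, safety_weight=1.0):
    """Pick the action that maximizes a weighted sum of two Q heads.

    Assumption: the two heads are combined by a weighted sum. The source
    does not state the combination rule, so this is illustrative only.
    """
    q_safety = np.asarray(q_safety, dtype=float)
    q_efficiency = np.asarray(q_efficiency, dtype=float)
    q_total = safety_weight * q_safety + q_efficiency
    return int(np.argmax(q_total)), q_total


def explain(action, q_safety, q_efficiency):
    """Report how each head scored the chosen action."""
    return {
        "action": action,
        "safety_q": float(q_safety[action]),
        "efficiency_q": float(q_efficiency[action]),
    }


# Toy values for three hypothetical actions: turn left, hold heading, turn right.
q_safety = [-0.1, -2.0, -0.3]      # holding heading looks dangerous
q_efficiency = [0.2, 1.0, 0.6]     # holding heading is the fastest route

action, q_total = select_action(q_safety, q_efficiency)
report = explain(action, q_safety, q_efficiency)
print(report)
```

In this toy case the efficiency head alone would hold heading, but the safety head penalizes that choice heavily, so the combined policy turns right. Because the two scores are kept separate, the reason for the choice can be read directly from the per-head values.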

Increasing traffic demands, higher levels of automation, and communication enhancements provide novel design opportunities for future air traffic controllers (ATCs). This article presents a novel deep reinforcement learning (DRL) controller to aid conflict resolution for autonomous free flight. Although DRL has achieved important advancements in this field, existing works pay little attention to the explainability and safety of DRL controllers, particularly safety under adversarial attacks. To address these two issues, we design a fully explainable DRL framework wherein we: 1) decompose the coupled Q value learning model into a safety-awareness model and an efficiency (reach the target) model; and 2) use information from surrounding intruders as inputs, eliminating the need for central controllers. In our simulated experiments, we show that by decoupling safety-awareness and efficiency, we can improve performance on free flight control tasks while dramatically improving explainability in practice. In addition, the safety Q learning module provides rich information about the safety situation of environments. To study safety under adversarial attacks, we additionally propose an adversarial attack strategy that can impose both safety-oriented and efficiency-oriented attacks. The adversary aims to minimize safety/efficiency by attacking the agent at only a few time steps. In the experiments, our attack strategy increases collisions as much as the uniform attack (i.e., attacking at every time step) while attacking the agent four times less often, which provides insights into the capabilities and restrictions of DRL in future ATC designs. The source code is publicly available at https://github.com/WLeiiiii/Gym-ATC-Attack-Project.
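The contrast between attacking at only a few time steps and the uniform attack can be sketched with a simple timing rule. This is not the attack strategy proposed in the paper, whose selection criterion is not given here. It swaps in a generic preference-gap heuristic: attack only when the agent strongly prefers one action over another. The function names, the threshold, and the toy Q values are hypothetical.

```python
import numpy as np


def should_attack(q_values, threshold):
    """Return True when the agent's action preference gap exceeds the threshold.

    Assumption: preferences are a softmax over Q values, and the gap is the
    difference between the most and least preferred action. This is a
    stand-in heuristic, not the criterion used in the paper.
    """
    q = np.asarray(q_values, dtype=float)
    probs = np.exp(q - q.max())   # subtract the max for numerical stability
    probs /= probs.sum()
    return bool(probs.max() - probs.min() > threshold)


def count_attacks(q_trajectory, threshold):
    """Count the time steps at which the timing rule would trigger an attack."""
    return sum(should_attack(q, threshold) for q in q_trajectory)


# Toy episode: Q values over three hypothetical actions at four time steps.
q_trajectory = [
    [0.0, 0.0, 0.0],    # indifferent: an attack would change little
    [0.1, 0.0, 0.05],   # nearly indifferent
    [3.0, 0.0, 0.0],    # strong preference: a critical decision
    [0.2, 0.1, 0.0],    # mild preference
]

timed_attacks = count_attacks(q_trajectory, threshold=0.5)
uniform_attacks = len(q_trajectory)  # the uniform attack perturbs every step
print(timed_attacks, uniform_attacks)
```

Here the timing rule fires only at the third step, where the agent's choice matters most, while the uniform attack would spend its budget on all four steps. Whether such a rule matches the damage of a uniform attack depends on the environment and the perturbation itself, which this sketch does not model.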
