Adaptive Aggregation for Safety-Critical Control
This work addresses safety-critical control in reinforcement learning, where safety is a central obstacle to real-world deployment. The contribution is incremental, however, since it builds on existing transfer learning and safety methods.
The paper aims to improve the learning efficiency and safety of reinforcement learning in real-world applications. It proposes an adaptive aggregation framework that transfers safety knowledge from multiple source tasks to a target task, resulting in fewer safety violations and better data efficiency than the baselines.
Safety has been recognized as the central obstacle preventing the use of reinforcement learning (RL) in real-world applications. Different methods have been developed to deal with safety concerns in RL. However, learning reliable RL-based solutions usually requires a large number of interactions with the environment. Moreover, how to improve learning efficiency, and specifically how to utilize transfer learning for safe reinforcement learning, has not been well studied. In this work, we propose an adaptive aggregation framework for safety-critical control. Our method comprises two key techniques: 1) we learn to transfer safety knowledge by aggregating multiple source tasks and a target task through an attention network; 2) we separate the goals of improving task performance and reducing constraint violations by utilizing a safeguard. Experimental results demonstrate that our algorithm achieves fewer safety violations and better data efficiency than several baselines.
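The two techniques named above can be illustrated with a minimal sketch. The abstract does not give the architecture, so everything here is an assumption made for illustration: the attention is a plain scaled dot-product over fixed key vectors, the candidates are the actions proposed by the source policies and the target policy, and the safeguard falls back to the candidate with the lowest predicted cost. The names `aggregate_actions`, `safeguard`, `W_query`, `keys`, and `cost_fn` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def aggregate_actions(state, candidate_actions, W_query, keys):
    """Attention-weighted mix of candidate actions (hypothetical sketch).

    state:             (state_dim,)
    candidate_actions: (K + 1, action_dim), K source policies plus the target
    W_query:           (d, state_dim), assumed learned projection of the state
    keys:              (K + 1, d), assumed learned key per candidate policy
    """
    query = W_query @ state                           # (d,)
    scores = keys @ query / np.sqrt(keys.shape[1])    # (K + 1,)
    weights = softmax(scores)                         # sums to 1
    return weights @ candidate_actions, weights


def safeguard(state, action, candidate_actions, cost_fn, threshold):
    """Keep the proposed action if its predicted cost is acceptable.

    Otherwise fall back to the lowest-cost candidate. This fallback rule
    is an assumption, not the safeguard described in the paper.
    """
    if cost_fn(state, action) <= threshold:
        return action
    costs = [cost_fn(state, a) for a in candidate_actions]
    return candidate_actions[int(np.argmin(costs))]
```

Keeping the safeguard as a separate step means the attention weights can be trained for task performance alone, while constraint handling is applied afterwards to whatever action the aggregation proposes.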