RO AI SE · Sep 14, 2021

Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems through Probabilistic Model Checking

arXiv:2109.06523v3 · 8.99 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the dependability of DRL-based robotics and autonomous systems with a formal assessment method. Its advance is incremental: it applies existing probabilistic model checking techniques to a new domain.

The paper tackles the challenge of assessing dependability in Deep Reinforcement Learning (DRL)-driven Robotics and Autonomous Systems (RAS) by formally defining dependability properties and running Probabilistic Model Checking (PMC) on a Discrete-Time Markov Chain (DTMC) model. The results show that the method works as a holistic assessment framework, that it uncovers conflicts between properties, and that standard DRL training cannot improve dependability, so bespoke optimization objectives are needed.

While Deep Reinforcement Learning (DRL) provides transformational capabilities to the control of Robotics and Autonomous Systems (RAS), the black-box nature of DRL and the uncertain deployment environments of RAS pose new challenges to its dependability. Although existing works impose constraints on the DRL policy to ensure successful completion of the mission, this is far from adequate for assessing the DRL-driven RAS holistically, considering all dependability properties. In this paper, we formally define a set of dependability properties in temporal logic and construct a Discrete-Time Markov Chain (DTMC) to model the dynamics of risk/failures of a DRL-driven RAS interacting with the stochastic environment. We then conduct Probabilistic Model Checking (PMC) on the designed DTMC to verify those properties. Our experimental results show that the proposed method is effective as a holistic assessment framework while uncovering conflicts between the properties that may need trade-offs in training. Moreover, we find that standard DRL training cannot improve dependability properties, thus requiring bespoke optimisation objectives. Finally, our method offers sensitivity analysis of dependability properties to disturbance levels from environments, providing insights for the assurance of real RAS.
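
To make the DTMC-plus-PMC idea concrete, here is a minimal sketch of the kind of query involved: the probability of eventually reaching a failure state (a PCTL-style `P=? [ F fail ]`), computed by solving the standard linear system for reachability in an absorbing Markov chain. The four states, the transition probabilities, and the `disturbance` parameter are all hypothetical and chosen only for illustration; they are not taken from the paper, and a real analysis would normally use a probabilistic model checker such as PRISM or Storm rather than hand-written linear algebra.

```python
import numpy as np

# Hypothetical state indices (illustrative only, not the paper's model).
SAFE, RISK, GOAL, FAIL = 0, 1, 2, 3

def build_dtmc(disturbance):
    """Return a row-stochastic transition matrix for a toy DTMC.

    `disturbance` (assumed to lie in [0, 0.5]) shifts probability mass
    from benign transitions towards risky ones.
    """
    d = disturbance
    return np.array([
        [0.7 - d, 0.1 + d,     0.2, 0.0        ],  # SAFE
        [0.5,     0.3 - d / 2, 0.0, 0.2 + d / 2],  # RISK
        [0.0,     0.0,         1.0, 0.0        ],  # GOAL (absorbing)
        [0.0,     0.0,         0.0, 1.0        ],  # FAIL (absorbing)
    ])

def reach_probability(P, target, transient=(SAFE, RISK)):
    """Probability of eventually reaching `target` from each transient state.

    Solves (I - Q) x = b, where Q is the transient-to-transient block of P
    and b holds the one-step probabilities of entering `target`.
    """
    t = list(transient)
    Q = P[np.ix_(t, t)]
    b = P[t, target]
    x = np.linalg.solve(np.eye(len(t)) - Q, b)
    return dict(zip(t, x))

if __name__ == "__main__":
    # Simple sensitivity sweep over hypothetical disturbance levels.
    for d in (0.0, 0.1, 0.2, 0.3):
        P = build_dtmc(d)
        p_fail = reach_probability(P, FAIL)[SAFE]
        p_goal = reach_probability(P, GOAL)[SAFE]
        print(f"disturbance={d:.1f}  P(F fail)={p_fail:.3f}  P(F goal)={p_goal:.3f}")
```

Sweeping `disturbance` and recomputing the same query is a small-scale analogue of the sensitivity analysis the abstract describes. In this toy chain the failure probability grows as the disturbance level rises, while the mission-completion probability falls by the same amount because the two absorbing states are the only possible outcomes.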

