RO | AI | Nov 7, 2023

Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking

Desong Du, Naiming Qi, Yanfang Liu, Wei Pan

arXiv:2311.03680v2 | 1.9 | h-index: 18

Originality: Incremental advance

AI Analysis

The work addresses safety-critical control for spaceflight missions and represents a domain-specific incremental advance.

The paper tackles autonomous spacecraft proximity maneuvers and docking by introducing a Bayesian actor-critic reinforcement learning algorithm with stability guarantees. The authors evaluate it experimentally on a spacecraft air-bearing testbed and report promising performance.

For autonomous spacecraft proximity maneuvers and docking (PMD), we introduce a novel Bayesian actor-critic reinforcement learning algorithm that learns a control policy with a stability guarantee. The PMD task is formulated as a Markov decision process that reflects the relative dynamic model, the docking cone, and the cost function. Drawing on Lyapunov theory, we frame temporal difference learning as a constrained Gaussian process regression problem. This approach allows the state-value function to be expressed as a Lyapunov function by leveraging Gaussian processes and deep kernel learning. We develop a novel Bayesian quadrature policy optimization procedure that computes the policy gradient analytically while integrating Lyapunov-based stability constraints. This integration is pivotal in satisfying the rigorous safety demands of spaceflight missions. The proposed algorithm has been experimentally evaluated on a spacecraft air-bearing testbed and shows promising performance.
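To make the idea of treating the state-value function as a Lyapunov function more concrete, the following minimal sketch checks a discrete-time decrease condition, V(s') - V(s) <= -alpha * c(s), on sampled transitions. This is not the paper's algorithm: the quadratic candidate `quadratic_value`, the toy contracting dynamics `step`, the margin `alpha`, and the exact form of the decrease condition are all assumptions chosen for illustration. The paper instead obtains the value function from constrained Gaussian process regression with a deep kernel.

```python
def quadratic_value(state, weights):
    """Hypothetical candidate V(s) = sum_i w_i * s_i^2 (positive away from the origin)."""
    return sum(w * x * x for w, x in zip(weights, state))


def lyapunov_violations(transitions, value_fn, cost_fn, alpha=0.1):
    """Return the transitions (s, s') that violate V(s') - V(s) <= -alpha * c(s)."""
    violations = []
    for s, s_next in transitions:
        change = value_fn(s_next) - value_fn(s)
        if change > -alpha * cost_fn(s):
            violations.append((s, s_next))
    return violations


def step(state, gain=0.8):
    """Toy closed-loop dynamics (assumption): each component is scaled by `gain`."""
    return tuple(gain * x for x in state)


def value_fn(state):
    # Assumed weights for a 2-D relative state (position error, velocity error).
    return quadratic_value(state, weights=(1.0, 0.5))


def cost_fn(state):
    # Assumed quadratic stage cost on the relative state.
    return sum(x * x for x in state)


states = [(1.0, -0.5), (0.3, 0.2), (-2.0, 1.0)]

# Contracting closed loop (gain < 1): the candidate decreases along every transition.
stable_violations = lyapunov_violations(
    [(s, step(s, gain=0.8)) for s in states], value_fn, cost_fn
)

# Diverging closed loop (gain > 1): every transition violates the condition.
unstable_violations = lyapunov_violations(
    [(s, step(s, gain=1.2)) for s in states], value_fn, cost_fn
)

print(len(stable_violations), len(unstable_violations))
```

In this toy setting, the contracting dynamics yield no violations and the diverging dynamics violate the condition on all three sampled transitions. A constrained learner in the spirit of the abstract would enforce such a condition during value-function fitting rather than check it afterwards.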

