SY · Oct 20, 2018

Deep Reinforcement Learning for Six Degree-of-Freedom Planetary Powered Descent and Landing

arXiv:1810.08719 · 28 citations · h-index: 33
Originality: Incremental advance
AI Analysis

The paper addresses the need for advanced autonomous landing algorithms for future Mars missions that require high precision. The approach is incremental: it applies existing RL methods to a specific aerospace problem.

This paper uses reinforcement learning (proximal policy optimization) to develop an integrated guidance and control algorithm for six-degree-of-freedom planetary powered descent and landing, achieving pinpoint accuracy (landing error ellipse < 5 m radius) with fuel-efficient trajectories and robustness to noise and parameter uncertainty.

Future Mars missions will require advanced guidance, navigation, and control algorithms for the powered descent phase to target specific surface locations and achieve pinpoint accuracy (landing error ellipse $<$ 5 m radius). The latter requires both a navigation system capable of estimating the lander's state in real-time and a guidance and control system that can map the estimated lander state to a commanded thrust for each lander engine. In this paper, we present a novel integrated guidance and control algorithm designed by applying the principles of reinforcement learning theory. The latter is used to learn a policy mapping the lander's estimated state directly to a commanded thrust for each engine, with the policy resulting in accurate and fuel-efficient trajectories. Specifically, we use proximal policy optimization, a policy gradient method, to learn the policy. Another contribution of this paper is the use of different discount rates for terminal and shaping rewards, which significantly enhances optimization performance. We present simulation results demonstrating the guidance and control system's performance in a 6-DOF simulation environment and demonstrate robustness to noise and system parameter uncertainty.
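The abstract's point about using different discount rates for terminal and shaping rewards can be illustrated with a minimal sketch. It assumes the per-step reward is split into a shaping stream and a terminal stream, each discounted with its own rate and then summed into the return that a policy gradient method such as PPO would use as a target. The function names, the reward split, and the default values `gamma_shaping=0.95` and `gamma_terminal=0.995` are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def two_rate_returns(
    shaping: Sequence[float],
    terminal: Sequence[float],
    gamma_shaping: float = 0.95,    # hypothetical value, not from the paper
    gamma_terminal: float = 0.995,  # hypothetical value, not from the paper
) -> List[float]:
    """Sum two return streams, each discounted with its own rate.

    `shaping` holds the dense per-step shaping rewards; `terminal` is zero
    everywhere except at the step where the episode ends.
    """
    if len(shaping) != len(terminal):
        raise ValueError("reward streams must have the same length")
    g_shaping = discounted_returns(shaping, gamma_shaping)
    g_terminal = discounted_returns(terminal, gamma_terminal)
    return [a + b for a, b in zip(g_shaping, g_terminal)]


if __name__ == "__main__":
    # Three-step episode: constant shaping reward, landing bonus at the end.
    shaping_rewards = [1.0, 1.0, 1.0]
    terminal_rewards = [0.0, 0.0, 10.0]
    print(two_rate_returns(shaping_rewards, terminal_rewards, 0.5, 1.0))
```

With a higher discount rate on the terminal stream, the landing reward reaches early time steps almost undiminished, while the shaping signal stays short-sighted. This is one plausible reading of why separate rates could help optimization; the paper itself should be consulted for the exact formulation.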

