SYRO · Jan 12, 2019

Adaptive Guidance with Reinforcement Meta-Learning

arXiv:1901.04473v1 · 16 citations
AI Analysis

This paper addresses robust autonomous navigation for spacecraft and robots in unpredictable environments, a strong domain-specific advance.

The paper tackles autonomous guidance in environments with unknown, highly variable dynamics by developing an adaptive guidance system that uses reinforcement meta-learning with recurrent policies. In the reported results, the recurrent policy outperformed non-recurrent baselines in tasks such as a Mars landing with random engine failure, achieving a 15% higher success rate, and it enabled integrated guidance and navigation using only Doppler radar altimeter returns.

This paper proposes a novel adaptive guidance system developed using reinforcement meta-learning with a recurrent policy and value function approximator. The use of recurrent network layers allows the deployed policy to adapt in real time to environmental forces acting on the agent. We compare the performance of the DR/DV guidance law, an RL agent with a non-recurrent policy, and an RL agent with a recurrent policy in four difficult tasks with unknown but highly variable dynamics. These tasks include a safe Mars landing with random engine failure and a landing on an asteroid with unknown environmental dynamics. We also demonstrate the ability of a recurrent policy to navigate using only Doppler radar altimeter returns, thus integrating guidance and navigation.
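To make the recurrent versus non-recurrent distinction concrete, here is a minimal NumPy sketch, not the paper's implementation. The class names, layer sizes, the Elman-style update, and the random untrained weights are all assumptions chosen for illustration; the abstract does not specify the network architecture, and the meta-learning training loop is omitted entirely. The sketch shows only the structural point: a recurrent policy carries a hidden state across time steps, so the same observation can produce different actions depending on what the agent has experienced so far in the episode.

```python
import numpy as np


class FeedForwardPolicy:
    """Non-recurrent baseline (hypothetical): action depends only on the current observation."""

    def __init__(self, obs_dim, act_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (hidden, obs_dim))
        self.W2 = rng.normal(0.0, 0.5, (act_dim, hidden))

    def act(self, obs):
        return self.W2 @ np.tanh(self.W1 @ obs)


class RecurrentPolicy:
    """Elman-style recurrent policy (hypothetical): hidden state h summarizes past observations."""

    def __init__(self, obs_dim, act_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0.0, 0.5, (hidden, obs_dim))
        self.Wh = rng.normal(0.0, 0.5, (hidden, hidden))
        self.Wo = rng.normal(0.0, 0.5, (act_dim, hidden))
        self.h = np.zeros(hidden)

    def reset(self):
        # Called at the start of each episode so no memory leaks between episodes.
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        # The new hidden state mixes the current observation with the previous state.
        self.h = np.tanh(self.Wx @ obs + self.Wh @ self.h)
        return self.Wo @ self.h


def rollout_then_query(policy, history, query):
    """Feed a history of observations, then return the action for a final query observation."""
    if hasattr(policy, "reset"):
        policy.reset()
    for obs in history:
        policy.act(obs)
    return policy.act(query)


# Two made-up observation histories standing in for two different environments.
query = np.array([1.0, -0.5, 0.2])
hist_a = [np.array([0.1, 0.0, 0.0])] * 5
hist_b = [np.array([2.0, -1.0, 0.5])] * 5

ff = FeedForwardPolicy(obs_dim=3, act_dim=2)
rnn = RecurrentPolicy(obs_dim=3, act_dim=2)

ff_same = np.allclose(rollout_then_query(ff, hist_a, query),
                      rollout_then_query(ff, hist_b, query))
rnn_same = np.allclose(rollout_then_query(rnn, hist_a, query),
                       rollout_then_query(rnn, hist_b, query))
print("feed-forward action independent of history:", ff_same)
print("recurrent action independent of history:", rnn_same)
```

With untrained weights the recurrent policy's history dependence is arbitrary. The idea described in the abstract is that training across many randomized environments shapes this dependence so that the hidden state ends up encoding useful information about the current dynamics.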
