LG, SY | May 20, 2024

Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls

Nathaniel Hamilton, Kyle Dunlap, Kerianne L. Hobbs

arXiv:2405.12355v12.61 citations | h-index: 8 | SMC-IT

Originality: Synthesis-oriented

AI Analysis

This work addresses autonomous control for space applications, such as inspection and docking, by evaluating action space design. It is incremental: it compares existing RL methods without introducing new algorithms.

The paper investigates the impact of discrete versus continuous action spaces in deep reinforcement learning for space control tasks, finding that a limited number of discrete choices yields optimal performance for an inspection task, while continuous control is best for a docking task.

For many space applications, traditional control methods are often used during operation. However, as the number of space assets continues to grow, autonomous operation can enable rapid development of control methods for different space-related tasks. One method of developing autonomous control is Reinforcement Learning (RL), which has become increasingly popular after demonstrating promising performance and success across many complex tasks. While it is common for RL agents to learn bounded continuous control values, this may not be realistic or practical for many space tasks that traditionally prefer an on/off approach for control. This paper analyzes using discrete action spaces, where the agent must choose from a predefined list of actions. The experiments explore how the number of choices provided to the agents affects their measured performance during and after training. This analysis is conducted for an inspection task, where the agent must circumnavigate an object to inspect points on its surface, and a docking task, where the agent must move into proximity of another spacecraft and "dock" with a low relative speed. A common objective of both tasks, and most space tasks in general, is to minimize fuel usage, which motivates the agent to regularly choose an action that uses no fuel. Our results show that a limited number of discrete choices leads to optimal performance for the inspection task, while continuous control leads to optimal performance for the docking task.
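To make the idea of a "predefined list of actions" concrete, the following is a minimal sketch of how a discrete thrust set might be built and decoded. It is not the authors' implementation: the function names, the thrust bound `u_max`, the even spacing of choices, and the sum-of-absolute-thrust fuel proxy are all assumptions made for illustration, since the abstract does not specify them. The sketch requires an odd number of choices so that the zero-thrust (no fuel) action is always available.

```python
def make_discrete_actions(num_choices, u_max=1.0):
    """Return num_choices evenly spaced thrust values in [-u_max, u_max].

    Hypothetical helper: requires an odd count >= 3 so that the
    zero-thrust (no fuel) option is always one of the choices.
    """
    if num_choices < 3 or num_choices % 2 == 0:
        raise ValueError("num_choices must be an odd integer >= 3")
    half = (num_choices - 1) // 2
    # Build the values from integer steps so the middle entry is exactly 0.0.
    return [u_max * k / half for k in range(-half, half + 1)]


def decode_action(indices, choices):
    """Map one chosen index per thruster axis to a thrust value per axis."""
    return [choices[i] for i in indices]


def fuel_proxy(thrust):
    """Assumed fuel measure: sum of absolute thrust over all axes."""
    return sum(abs(u) for u in thrust)


# Three choices per axis reproduces on/off style control:
# make_discrete_actions(3) gives [-1.0, 0.0, 1.0]
on_off = make_discrete_actions(3)

# An agent that picks index 1 on every axis coasts and spends no fuel.
coast = decode_action([1, 1, 1], on_off)
```

Increasing `num_choices` gives the agent finer thrust resolution and moves the discrete set closer to bounded continuous control, which is the trade-off the experiments vary.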
