Shaping Rewards, Shaping Routes: On Multi-Agent Deep Q-Networks for Routing in Satellite Constellation Networks
This work addresses routing in satellite networks, which is crucial for handling increasing traffic and for integration into 6G. It appears incremental, however, as it builds on existing deep reinforcement learning methods.
The paper tackles dynamic routing in satellite mega-constellations by investigating multi-agent deep Q-networks. It proposes a hybrid solution, based on centralized learning and decentralized control, to jointly optimize latency and load balancing, and reports improved adaptability and robustness in static and dynamic scenarios.
Effective routing in satellite mega-constellations has become crucial for handling increasing traffic loads and more complex network architectures, as well as for integration into 6G networks. To enhance adaptability and robustness to unpredictable traffic demands, and to route efficiently in dynamic environments, machine learning-based solutions are being considered. For network control problems, such as optimizing packet forwarding decisions according to Quality of Service requirements and maintaining network stability, deep reinforcement learning techniques have demonstrated promising results. For this reason, we investigate the viability of multi-agent deep Q-networks for routing in satellite constellation networks. We focus specifically on reward shaping and on quantifying training convergence for the joint optimization of latency and load balancing in static and dynamic scenarios. To address the identified drawbacks, we propose a novel hybrid solution based on centralized learning and decentralized control.
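To make the idea of reward shaping for joint latency and load-balancing optimization more concrete, the following minimal Python sketch shows one way a per-hop reward and the corresponding Q-learning target could be written. It is an illustration only: the `HopOutcome` fields, the weights `w_latency` and `w_load`, the normalization constant `delay_scale_ms`, and the delivery and drop terms are all assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HopOutcome:
    """Hypothetical observation returned after a satellite forwards one packet."""
    link_delay_ms: float  # delay of the chosen inter-satellite link
    queue_fill: float     # buffer occupancy of the next hop, in [0, 1]
    delivered: bool       # packet reached its destination on this hop
    dropped: bool         # packet was lost, e.g. due to buffer overflow


def shaped_reward(
    outcome: HopOutcome,
    w_latency: float = 1.0,        # assumed weight of the latency term
    w_load: float = 1.0,           # assumed weight of the load-balancing term
    delay_scale_ms: float = 50.0,  # assumed normalization for link delay
    delivery_bonus: float = 10.0,  # assumed terminal bonus
    drop_penalty: float = 10.0,    # assumed terminal penalty
) -> float:
    """Combine latency and congestion into a single scalar reward."""
    if outcome.dropped:
        return -drop_penalty
    # Penalize slow links and congested next hops on every forwarding step.
    reward = -w_latency * (outcome.link_delay_ms / delay_scale_ms)
    reward -= w_load * outcome.queue_fill
    if outcome.delivered:
        reward += delivery_bonus
    return reward


def td_target(
    reward: float,
    next_q_values: Sequence[float],
    terminal: bool,
    gamma: float = 0.99,
) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


# Example: a 25 ms hop into a half-full queue, packet still in transit.
step = HopOutcome(link_delay_ms=25.0, queue_fill=0.5, delivered=False, dropped=False)
r = shaped_reward(step)
target = td_target(r, next_q_values=[0.0, 2.0], terminal=False, gamma=0.5)
```

In this sketch each satellite agent would evaluate `shaped_reward` locally after forwarding a packet, while the targets from `td_target` could be collected and used for training in one place. That is one possible reading of centralized learning with decentralized control, not a description of the authors' design.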