Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
This work addresses the need for scalable and adaptive resource management in Fog Computing for IoT systems. Its contribution is incremental: it builds on existing MARL approaches, with a focus on distributed implementation and realistic observation assumptions.
The paper tackles the problem of efficiently distributing unpredictable traffic demands among heterogeneous Fog resources for real-time IoT applications. It proposes a fully distributed load-balancing solution based on Multi-Agent Reinforcement Learning (MARL) that achieves lower waiting time than a single centralized agent and other baseline methods. It also analyzes the trade-off between realism and performance when observations arrive at intervals rather than in real time.
Real-time Internet of Things (IoT) applications need timely access to computing resources to process ever-growing IoT workloads. Fog Computing provides high availability of such resources in a distributed manner. However, these resources must be managed efficiently to distribute unpredictable traffic demands among heterogeneous Fog resources. This paper proposes a fully distributed load-balancing solution with Multi-Agent Reinforcement Learning (MARL) that intelligently distributes IoT workloads to optimize the waiting time while providing fair resource utilization in the Fog network. The MARL agents use transfer learning for life-long self-adaptation to dynamic changes in the environment. By leveraging distributed decision-making, the agents achieve lower waiting time than a single centralized agent solution and other baselines, improving end-to-end execution delay. Besides the performance gain, a fully distributed solution allows for a global-scale implementation in which agents work independently in small collaboration regions, leveraging nearby local resources. Furthermore, we analyze the impact of observing the state of the environment at a realistic frequency, in contrast to the common but unrealistic assumption in the literature that observations are readily available in real time for every required action. The findings highlight the trade-off between realism and performance by comparing an interval-based Gossip multicasting protocol with the assumption of real-time observation availability for every generated workload.
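The trade-off between observation realism and performance can be illustrated with a toy simulation. The sketch below is not the paper's method: it replaces the MARL agents with a single greedy dispatcher that sends each workload to the node with the lowest estimated waiting time, and it models the Gossip protocol only as a state snapshot refreshed every `interval` steps. The function name `simulate`, the service rates, the unit workload size, and the step count are all assumptions chosen for illustration.

```python
def simulate(interval, steps=600, rates=(0.5, 0.3, 0.4), size=1.0):
    """Mean waiting time when node state is observed every `interval` steps.

    Hypothetical toy model: one workload of `size` units arrives per step,
    and node i serves rates[i] units of work per step.
    """
    backlog = [0.0] * len(rates)   # true outstanding work per Fog node
    observed = list(backlog)       # dispatcher's (possibly stale) view
    total_wait = 0.0

    for t in range(steps):
        # Interval-based observation: refresh the view only every
        # `interval` steps (interval=1 means real-time observations).
        if t % interval == 0:
            observed = list(backlog)

        # Greedy heuristic (stand-in for a learned policy): pick the node
        # with the lowest estimated waiting time in the observed state.
        node = min(range(len(rates)), key=lambda i: observed[i] / rates[i])

        # The workload actually waits for the true backlog to drain.
        total_wait += backlog[node] / rates[node]
        backlog[node] += size

        # Every node serves work for one step.
        backlog = [max(0.0, b - r) for b, r in zip(backlog, rates)]

    return total_wait / steps


if __name__ == "__main__":
    for interval in (1, 5, 20):
        print(f"interval={interval:>2}  mean wait={simulate(interval):.2f}")
```

In this toy setting, a stale view makes the dispatcher keep sending workloads to a node that only looked idle at the last refresh, so the mean waiting time grows with the observation interval. A learned policy could partly compensate for staleness, which this greedy sketch does not attempt to show.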