MIX-MAB: Reinforcement Learning-based Resource Allocation Algorithm for LoRaWAN
This paper addresses resource allocation for end devices in LoRaWAN networks. Its contribution is incremental, since it builds on existing multi-armed bandit and reinforcement learning methods.
The paper tackles the problem of improving the packet delivery ratio (PDR) in LoRaWAN by proposing a reinforcement learning-based resource allocation algorithm called MIX-MAB. The algorithm takes a two-phase approach that combines the EXP3 and successive elimination algorithms. In simulation, it outperforms existing schemes in both convergence time and PDR.
This paper focuses on improving the resource allocation algorithm in terms of packet delivery ratio (PDR), i.e., the fraction of packets sent by end devices (EDs) that are successfully received, in a long-range wide-area network (LoRaWAN). The choice of transmission parameters significantly affects the PDR. Employing reinforcement learning (RL), we propose a resource allocation algorithm that enables the EDs to configure their transmission parameters in a distributed manner. We model the resource allocation problem as a multi-armed bandit (MAB) and then address it by proposing a two-phase algorithm named MIX-MAB, which consists of the exponential weights for exploration and exploitation (EXP3) and successive elimination (SE) algorithms. We evaluate the performance of MIX-MAB through simulation and compare it with existing approaches. Numerical results show that the proposed solution performs better than the existing schemes in terms of convergence time and PDR.
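To make the two-phase idea concrete, the following is a minimal sketch of how an EXP3 phase could be followed by a successive elimination phase on a single end device. It is not the authors' MIX-MAB specification: the abstract does not say how the two phases are joined, so the hand-over rule (keep the `keep` highest-weighted arms after EXP3), the confidence radius, and all names (`pull`, `exp3_phase`, `se_phase`, `mix_mab`) are assumptions made for illustration. Each arm stands for one transmission-parameter configuration, and the reward is assumed to be 1 when the packet is acknowledged and 0 otherwise.

```python
import math
import random


def exp3_probs(weights, gamma):
    """EXP3 sampling distribution: weight share mixed with uniform exploration."""
    total = sum(weights)
    k = len(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]


def exp3_phase(pull, k, rounds, gamma, rng):
    """Run EXP3 for `rounds` steps and return the final (normalized) weights."""
    weights = [1.0] * k
    for _ in range(rounds):
        probs = exp3_probs(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]
        reward = pull(arm)               # assumed in [0, 1], e.g. 1 if ACKed
        estimate = reward / probs[arm]   # importance-weighted reward estimate
        weights[arm] *= math.exp(gamma * estimate / k)
        top = max(weights)               # rescale to avoid overflow;
        weights = [w / top for w in weights]  # probabilities are unchanged
    return weights


def se_phase(pull, arms, rounds, horizon):
    """Successive elimination: sample every active arm once per round and
    drop arms whose upper bound falls below the best lower bound."""
    active = list(arms)
    counts = {a: 0 for a in active}
    sums = {a: 0.0 for a in active}

    def mean(a):
        return sums[a] / counts[a] if counts[a] else 0.0

    def radius(a):
        # Assumed Hoeffding-style confidence radius.
        return math.sqrt(2 * math.log(horizon) / counts[a])

    for _ in range(rounds):
        if len(active) == 1:
            break
        for a in active:
            sums[a] += pull(a)
            counts[a] += 1
        best_lower = max(mean(a) - radius(a) for a in active)
        active = [a for a in active if mean(a) + radius(a) >= best_lower]
    return max(active, key=mean)


def mix_mab(pull, k, exp3_rounds, se_rounds, gamma=0.1, keep=2, rng=None):
    """Two-phase sketch: EXP3 first, then SE on the `keep` best-weighted arms.
    The hand-over rule is an assumption, not taken from the paper."""
    rng = rng or random.Random()
    weights = exp3_phase(pull, k, exp3_rounds, gamma, rng)
    survivors = sorted(range(k), key=lambda a: weights[a], reverse=True)[:keep]
    horizon = max(2, exp3_rounds + se_rounds)
    return se_phase(pull, survivors, se_rounds, horizon)


if __name__ == "__main__":
    # Hypothetical per-configuration delivery probabilities, for illustration only.
    success = [0.2, 0.4, 0.9, 0.5]
    env = random.Random(0)
    chosen = mix_mab(lambda a: float(env.random() < success[a]),
                     k=len(success), exp3_rounds=300, se_rounds=200,
                     rng=random.Random(1))
    print("selected configuration index:", chosen)
```

In this sketch each device runs the procedure on its own feedback only, which matches the distributed setting described in the abstract. The split between exploration rounds and elimination rounds is a tuning choice here and would need to be set from the paper's own parameters.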