LG, ML | Jun 5, 2020

Concurrent Decentralized Channel Allocation and Access Point Selection using Multi-Armed Bandits in multi BSS WLANs

arXiv:2006.03350v11.2

Originality: Synthesis-oriented

AI Analysis

This work addresses performance optimization in enterprise wireless networks, but it is incremental as it applies known reinforcement learning techniques to a specific domain.

The paper tackles the problem of optimizing channel allocation and access point selection in enterprise WLANs by using Multi-Armed Bandits with Thompson sampling, showing that this adaptive framework outperforms static approaches in various network scenarios and reduces performance variability.

Enterprise Wireless Local Area Networks (WLANs) consist of multiple Access Points (APs) covering a given area. Finding a network configuration that maximizes the performance of enterprise WLANs is challenging given the complex dependencies between APs and stations. Recently, reinforcement learning techniques have emerged in wireless networking as an effective way to explore the impact of different network configurations on system performance and to identify those that perform better. In this paper, we study whether Multi-Armed Bandits (MABs) offer a feasible solution to the decentralized channel allocation and AP selection problems in enterprise WLAN scenarios. To do so, we equip APs and stations with agents that implement the Thompson sampling algorithm to explore and learn which channel is best to use and which AP is best to associate with, respectively. Our evaluation is performed over randomly generated scenarios that cover different network topologies and traffic loads. The results show that the proposed adaptive framework using MABs outperforms the static approach (i.e., always using the initial default configuration, usually random) regardless of the network density and the traffic requirements. Moreover, we show that the proposed framework reduces the performance variability between different scenarios. Results also show that we achieve the same or better performance than static strategies with fewer APs for the same number of stations. Finally, special attention is placed on how the agents interact. Even though the agents operate in a completely independent manner, their decisions have interrelated effects, as they take actions over the same set of channel resources.
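The abstract does not give the reward definition or the exact Thompson sampling variant, so the following is only a minimal sketch of how a single agent could learn a channel. It assumes a Gaussian Thompson sampling variant with a standard normal prior, rewards normalized to [0, 1], and a toy environment with fixed per-channel quality. The names `ThompsonAgent`, `run_toy`, and `channel_quality` are hypothetical and do not come from the paper.

```python
import random


class ThompsonAgent:
    """Gaussian Thompson sampling over a finite set of arms (illustrative only)."""

    def __init__(self, n_arms, rng):
        self.counts = [0] * n_arms    # times each arm was played
        self.sums = [0.0] * n_arms    # cumulative reward per arm
        self.rng = rng

    def select(self):
        # Draw one sample per arm from N(sum / (n + 1), 1 / (n + 1)),
        # then play the arm with the largest sample.
        samples = []
        for total, n in zip(self.sums, self.counts):
            mean = total / (n + 1)
            std = (1.0 / (n + 1)) ** 0.5
            samples.append(self.rng.gauss(mean, std))
        return max(range(len(samples)), key=lambda arm: samples[arm])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


def run_toy(seed=0, rounds=2000, channel_quality=(0.2, 0.5, 0.8)):
    """Toy stand-in for a WLAN: each channel yields a noisy reward in [0, 1].

    The quality values are assumptions chosen for illustration, not measurements.
    """
    rng = random.Random(seed)
    agent = ThompsonAgent(len(channel_quality), rng)
    for _ in range(rounds):
        arm = agent.select()
        noisy = rng.gauss(channel_quality[arm], 0.1)
        reward = min(1.0, max(0.0, noisy))  # clip to the assumed reward range
        agent.update(arm, reward)
    return agent.counts


if __name__ == "__main__":
    # Number of times each channel was selected.
    print(run_toy())
```

In the setting the abstract describes, each AP would run one such agent over the available channels and each station one over the candidate APs. Because all agents share the same channel resources, the reward each one observes would also depend on the others' choices, which this single-agent toy does not model.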
